
Evaluating Japanese OCR

 This page evaluates Japanese text recognition (OCR).

※ Last updated: 2021/04/12

Freely available program

 We try out the publicly released program OCR_Japanease (https://github.com/tanreinama/OCR_Japanease).

Installation steps

  1. Clone the repository from GitHub
    $ git clone https://github.com/tanreinama/OCR_Japanease.git
    Cloning into 'OCR_Japanease'...
    remote: Enumerating objects: 122, done.
    remote: Counting objects: 100% (122/122), done.
    remote: Compressing objects: 100% (92/92), done.
    remote: Total 122 (delta 53), reused 77 (delta 23), pack-reused 0
    Receiving objects: 100% (122/122), 3.45 MiB | 19.87 MiB/s, done.
    Resolving deltas: 100% (53/53), done.
    $ cd OCR_Japanease
    $ ls
    LICENSE  nets              report            testshot1.png-detections.png
    misc     ocr_japanease.py  requirements.txt  testshot2.png
    models   readme.md         testshot1.png     testshot2.png-detections.png
  2. Download the pretrained models
    $ wget https://nama.ne.jp/models/ocr_jp-v2.zip
    --2021-04-12 14:00:01--  https://nama.ne.jp/models/ocr_jp-v2.zip
    Resolving nama.ne.jp (nama.ne.jp)... 112.78.112.176
    Connecting to nama.ne.jp (nama.ne.jp)|112.78.112.176|:443... connected.
    HTTP request sent, awaiting response... 200 OK
    Length: 180256769 (172M) [application/zip]
    Saving to: `ocr_jp-v2.zip'
    
    ocr_jp-v2.zip       100%[===================>] 171.91M  10.6MB/s    in 17s     
    
    2021-04-12 14:00:17 (10.4 MB/s) - `ocr_jp-v2.zip' saved [180256769/180256769]
    $ unzip ocr_jp-v2.zip
    Archive:  ocr_jp-v2.zip
      inflating: models/detectionnet.model  
      inflating: models/classifiernet.model
  3. Try running OCR (the first attempt fails because scikit-learn is not installed)
    mizutu@ubuntu2004dk:~/OCR_Japanease$ python3 ocr_japanease.py testshot1.png
    Traceback (most recent call last):
      File "ocr_japanease.py", line 11, in <module>
        from misc.detection import Detector
      File "/home/mizutu/OCR_Japanease/misc/detection.py", line 5, in <module>
        from sklearn.cluster import OPTICS
    ModuleNotFoundError: No module named 'sklearn'
    • Install scikit-learn
      mizutu@ubuntu2004dk:~/OCR_Japanease$ sudo pip3 install scikit-learn
      Collecting scikit-learn
        Downloading scikit_learn-0.24.1-cp38-cp38-manylinux2010_x86_64.whl (24.9 MB)
           |████████████████████████████████| 24.9 MB 21.4 MB/s 
      Requirement already satisfied: numpy>=1.13.3 in /usr/local/lib/python3.8/dist-packages (from scikit-learn) (1.18.5)
      Collecting joblib>=0.11
        Downloading joblib-1.0.1-py3-none-any.whl (303 kB)
           |████████████████████████████████| 303 kB 41.3 MB/s 
      Collecting scipy>=0.19.1
        Downloading scipy-1.6.2-cp38-cp38-manylinux1_x86_64.whl (27.2 MB)
           |████████████████████████████████| 27.2 MB 27.3 MB/s 
      Collecting threadpoolctl>=2.0.0
        Downloading threadpoolctl-2.1.0-py3-none-any.whl (12 kB)
      Installing collected packages: threadpoolctl, scipy, joblib, scikit-learn
      Successfully installed joblib-1.0.1 scikit-learn-0.24.1 scipy-1.6.2 threadpoolctl-2.1.0
      WARNING: You are using pip version 20.3.3; however, version 21.0.1 is available.
      You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
    • Check the installed scikit-learn version
      mizutu@ubuntu2004dk:~/OCR_Japanease$ python3
      Python 3.8.5 (default, Jan 27 2021, 15:41:15) 
      [GCC 9.3.0] on linux
      Type "help", "copyright", "credits" or "license" for more information.
      >>> import sklearn
      >>> print(sklearn.__version__)
      0.24.1
      >>> exit()
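
The missing-module error in step 3 could be caught before the first run with a small pre-flight check. The following is a minimal sketch, not part of OCR_Japanease: the function names are hypothetical, the module list is taken only from the tracebacks on this page (so it may not cover everything in requirements.txt), and the model paths are the two files extracted in step 2.

```python
import importlib.util
from pathlib import Path

# Assumption: only the modules that appear in this page's tracebacks.
# Note that the import name "sklearn" is installed as the pip package "scikit-learn".
REQUIRED_MODULES = ["sklearn", "torch"]

# Files produced by unzipping ocr_jp-v2.zip in the repository root.
REQUIRED_FILES = ["models/detectionnet.model", "models/classifiernet.model"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def missing_files(paths, root="."):
    """Return the paths that do not exist as files under root."""
    return [p for p in paths if not (Path(root) / p).is_file()]


if __name__ == "__main__":
    for name in missing_modules(REQUIRED_MODULES):
        print(f"missing module: {name}")
    for path in missing_files(REQUIRED_FILES):
        print(f"missing file: {path}")
```

Run from the OCR_Japanease directory, the script prints nothing when both modules and both model files are present.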

Running OCR

  1. Run the program
    $ python3 ocr_japanease.py testshot1.png
    Traceback (most recent call last):
      File "ocr_japanease.py", line 157, in <module>
        main()
      File "ocr_japanease.py", line 34, in main
        ocr_result = get_ocr(d, dpi=args.dpi, use_cuda=(not args.cpu), output_detect_img=args.output_detect_img, low_gpu_memory=args.low_gpu_memory)
      File "ocr_japanease.py", line 72, in get_ocr
        model.load_state_dict(torch.load(det_model))
      File "/home/mizutu/.local/lib/python3.8/site-packages/torch/serialization.py", line 592, in load
        return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
      File "/home/mizutu/.local/lib/python3.8/site-packages/torch/serialization.py", line 851, in _load
        result = unpickler.load()
      File "/home/mizutu/.local/lib/python3.8/site-packages/torch/serialization.py", line 843, in persistent_load
        load_tensor(data_type, size, key, _maybe_decode_ascii(location))
      File "/home/mizutu/.local/lib/python3.8/site-packages/torch/serialization.py", line 832, in load_tensor
        loaded_storages[key] = restore_location(storage, location)
      File "/home/mizutu/.local/lib/python3.8/site-packages/torch/serialization.py", line 175, in default_restore_location
        result = fn(storage, location)
      File "/home/mizutu/.local/lib/python3.8/site-packages/torch/serialization.py", line 151, in _cuda_deserialize
        device = validate_cuda_device(location)
      File "/home/mizutu/.local/lib/python3.8/site-packages/torch/serialization.py", line 135, in validate_cuda_device
        raise RuntimeError('Attempting to deserialize object on a CUDA '
    RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
    • An option appears to be required: on a machine without CUDA, pass --cpu
      $ python3 ocr_japanease.py --cpu testshot1.png
      file "testshot1.png" detected in 72 dpi.
      [Block #0]
      コロナウイルスにまけるな
      [Block #1]
      がんばろう
      [Block #2]
      日本
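
The RuntimeError above suggests passing map_location to torch.load on a CPU-only machine. The sketch below illustrates that suggestion in isolation. It is not how ocr_japanease.py implements --cpu (that code is not shown on this page), and pick_device and load_state_dict_portably are hypothetical helper names.

```python
def pick_device(cuda_available, force_cpu=False):
    """Return "cuda" only when a GPU is usable and CPU was not requested."""
    if cuda_available and not force_cpu:
        return "cuda"
    return "cpu"


def load_state_dict_portably(path, force_cpu=False):
    """Load a checkpoint, remapping CUDA storages to the CPU when needed."""
    import torch  # imported here so pick_device works without PyTorch

    device = pick_device(torch.cuda.is_available(), force_cpu)
    # map_location moves storages that were saved on a CUDA device to `device`.
    return torch.load(path, map_location=torch.device(device))
```

For example, load_state_dict_portably("models/detectionnet.model", force_cpu=True) would load the detection model's weights onto the CPU regardless of how they were saved.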

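To use the recognized text in another program, the printed result can be split into blocks. This is a minimal sketch that assumes the output always follows the format of the single run shown above ("[Block #N]" header lines followed by text lines); parse_blocks is a hypothetical helper, and joining multi-line blocks with newlines is an assumption.

```python
import re

# Header lines look like "[Block #0]" in the sample run.
BLOCK_HEADER = re.compile(r"^\[Block #(\d+)\]$")


def parse_blocks(output):
    """Split the tool's stdout into a list of text blocks, in order."""
    blocks = []
    current = None  # lines of the block being read; None before the first header
    for line in output.splitlines():
        line = line.strip()
        if BLOCK_HEADER.match(line):
            current = []
            blocks.append(current)
        elif current is not None and line:
            current.append(line)
    return ["\n".join(lines) for lines in blocks]


sample = """file "testshot1.png" detected in 72 dpi.
[Block #0]
コロナウイルスにまけるな
[Block #1]
がんばろう
[Block #2]
日本
"""
print(parse_blocks(sample))
# prints ['コロナウイルスにまけるな', 'がんばろう', '日本']
```

The input string could come from subprocess.run(["python3", "ocr_japanease.py", "--cpu", "testshot1.png"], capture_output=True, text=True).stdout; lines before the first block header, such as the dpi message, are ignored.
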

Last-modified: 2021-04-13 (Tue) 09:26:23