Wrapping the Tencent Cloud OCR API with "borrowed" code, and using front-end events to recognize invoices in both image and PDF formats

yzm130915

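I recently tried to build an expense-reimbursement app on Jiandaoyun (简道云) that recognizes invoice information automatically. After comparing the invoice APIs offered by the major platforms, I found that the VAT invoice recognition in Tencent Cloud OCR handles both PDF and image files and is reasonably priced, so it was a good fit. I had never used Tencent Cloud products before, so I first read a few forum posts about Tencent Cloud Functions and noticed that they all pass parameters with GET, whereas the OCR product requires POST. I asked Jiandaoyun support and was disappointed to learn that front-end events do not support POSTing parameters to Tencent Cloud: you have to run your own server and write code to wrap the API.
I had never used a server before and knew nothing about coding or wrapping APIs, so I studied the Jiandaoyun API documentation and tried to build this myself. After pestering both Tencent's and Jiandaoyun's support countless times, I finally got it working. Looking back, the process is actually simple: you only need to copy a few pieces of code. Here is how: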

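I. Preparation
1. Activate a Tencent Cloud Lighthouse server (轻量应用服务器; new users get one month free), choose Ubuntu as the system, and under Firewall - Rules add a TCP port, e.g. 9527;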

2. Activate Tencent Cloud OCR (https://cloud.tencent.com/document/product/866/34681) and activate Tencent Cloud Functions;

3. Ubuntu already comes with Python 3.8; the following libraries need to be installed
Install the Tencent Cloud SDK package
pip3 install -i https://mirrors.tencent.com/pypi/simple/ --upgrade tencentcloud-sdk-python
Install requests and flask
pip3 install requests
pip3 install flask

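II. Copying the code
1. The first piece to copy is the code auto-generated by Tencent Cloud OCR. Go to the OCR console, follow the integration guide, and open the API 3.0 Explorer online debugging page. Select the bill recognition APIs, then VAT invoice recognition (VatInvoiceOCR). For the input parameters, any region will do. A local image would need ImageBase64, but a front-end event only pushes a file download link, so this example uses ImageUrl. To support PDF, set IsPdf to True and PdfPageNumber to 1, as shown in the figure:

Select the Python code and copy all of it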

import json
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models
try:
    # replace "SecretId" and "SecretKey" with your own API keys
    cred = credential.Credential("SecretId", "SecretKey")
    httpProfile = HttpProfile()
    httpProfile.endpoint = "ocr.tencentcloudapi.com"
    clientProfile = ClientProfile()
    clientProfile.httpProfile = httpProfile
    client = ocr_client.OcrClient(cred, "ap-guangzhou", clientProfile)
    req = models.VatInvoiceOCRRequest()
    # "ImageUrl" is a placeholder for the file download link
    params = {
        "ImageUrl": "ImageUrl",
        "IsPdf": True,
        "PdfPageNumber": 1
    }
    req.from_json_string(json.dumps(params))
    resp = client.VatInvoiceOCR(req)
    print(resp.to_json_string())
except TencentCloudSDKException as err:
    print(err)

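The choice between ImageUrl and ImageBase64 described above can be sketched as a small helper. This is only an illustration, not something the API Explorer generates: build_vat_params and its arguments are hypothetical names, while ImageUrl, ImageBase64, IsPdf and PdfPageNumber are the request parameters mentioned above. IsPdf defaults to True simply because that is what this example uses.

```python
import base64
import json


def build_vat_params(image_url=None, image_path=None, is_pdf=True, pdf_page=1):
    """Build the VatInvoiceOCR request parameters as a dict.

    Hypothetical helper: pass image_url for a download link (the front-end
    event case) or image_path for a local file, which is sent as Base64.
    """
    if bool(image_url) == bool(image_path):
        raise ValueError("pass exactly one of image_url or image_path")

    params = {"IsPdf": is_pdf, "PdfPageNumber": pdf_page}
    if image_url:
        params["ImageUrl"] = image_url
    else:
        # A local file has to be Base64-encoded before it is sent.
        with open(image_path, "rb") as f:
            params["ImageBase64"] = base64.b64encode(f.read()).decode("ascii")
    return params


# Placeholder link; the result is what would go into req.from_json_string(...).
print(json.dumps(build_vat_params(image_url="https://example.com/invoice.pdf")))
```

With the generated code, the call would then read req.from_json_string(json.dumps(build_vat_params(image_url=...))).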

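2. The second piece to copy comes from Cloud Functions: choose to create from a template and pick the Python 3.6 image-to-text template. The code is as follows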

# readme : https://cloud.tencent.com/document/product/583/30589
# -*- coding: utf-8 -*-
import os
import logging
import datetime
import base64
import json
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models
import sys
print('Loading function')
logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logger = logging.getLogger()
logger.setLevel(level=logging.INFO)

def main_handler(event, context):
    logger.info("start main handler")
    if "requestContext" not in event.keys():
        return {"code": 410, "errorMsg": "event is not come from api gateway"}
    if "body" not in event.keys():
        return {
            "isBase64Encoded": False,
            "statusCode": 200,
            "headers": {"Content-Type": "text", "Access-Control-Allow-Origin": "*"},
            "body": "there is no file from api gateway"
        }
    # The image format uploaded from the gateway has been encoded with Base64, which can be directly obtained from event['body'].
    logger.info("Start to detection")
    res_text = " "  # initialised before the try so it exists even if the call fails
    try:
        secret_id = os.environ.get('TENCENTCLOUD_SECRETID')
        secret_key = os.environ.get('TENCENTCLOUD_SECRETKEY')
        token = os.environ.get('TENCENTCLOUD_SESSIONTOKEN')
        cred = credential.Credential(secret_id, secret_key, token)
        httpProfile = HttpProfile()
        httpProfile.endpoint = "ocr.tencentcloudapi.com"
        clientProfile = ClientProfile()
        clientProfile.httpProfile = httpProfile
        client = ocr_client.OcrClient(cred, "ap-beijing", clientProfile)
        req = models.GeneralBasicOCRRequest()
        params = '{"ImageBase64":"%s"}' % event['body']
        req.from_json_string(params)
        resp = client.GeneralBasicOCR(req)
        res_ai = json.loads(resp.to_json_string())
        print(len(res_ai["TextDetections"]))
        # concatenate the text of every detected item
        for i in range(len(res_ai["TextDetections"])):
            res_text = res_text + str(res_ai["TextDetections"][i]["DetectedText"])
    except TencentCloudSDKException as err:
        print(err)

    print(res_text)
    response = {
        "isBase64Encoded": False,
        "statusCode": 200,
        "headers": {"Content-Type": "text", "Access-Control-Allow-Origin": "*"},
        "body": res_text
    }
    return response


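Because both call the OCR API, these two pieces of code overlap heavily. In the first piece, the data has already been returned at resp = client.VatInvoiceOCR(req); in the cloud function code, everything from res_ai = json.loads(resp.to_json_string()) onward is processing of the returned result, which we can borrow and adapt.
Looking at the VAT invoice recognition API documentation (https://cloud.tencent.com/document/api/866/36210), the output parameter is VatInvoiceInfos, and the invoice details sit in the values of "Name" and "Value". We need to reprocess the returned result into a new dictionary, with each "Name" value as the key and the matching "Value" value as the value. Copy the part highlighted in orange and modify it as follows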

        res_str = json.loads(resp.to_json_string())  # parse the returned JSON string into a dict
        # keep only the Name and Value of each entry and build a dictionary from them
        res_key = []
        res_value = []
        for i in range(len(res_str["VatInvoiceInfos"])):
            res_key.append(res_str["VatInvoiceInfos"][i]["Name"])
            res_value.append(res_str["VatInvoiceInfos"][i]["Value"])
        response = dict(zip(res_key, res_value))
        return response

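Since the complete code is not shown in the post, here is one way the first piece and the fragment above might be joined into the img_ocr function that the Flask route calls later. Treat it as a sketch under assumptions rather than the original implementation: flatten_invoice_infos is a hypothetical helper name, reading the keys from the TENCENTCLOUD_SECRETID and TENCENTCLOUD_SECRETKEY environment variables is an assumption, returning an "error" entry on failure is an assumption, and the sample data at the end is made up to match the Name/Value shape.

```python
import json
import os


def flatten_invoice_infos(infos):
    """Map a list of {"Name": ..., "Value": ...} entries to {Name: Value}.

    If a Name occurs more than once, the last Value wins.
    """
    return {item["Name"]: item["Value"] for item in infos}


def img_ocr(image_url):
    """Call VatInvoiceOCR for a download link and return a flat dict."""
    # Imported here so flatten_invoice_infos can be tried without the SDK.
    from tencentcloud.common import credential
    from tencentcloud.common.exception.tencent_cloud_sdk_exception import (
        TencentCloudSDKException,
    )
    from tencentcloud.ocr.v20181119 import ocr_client, models

    try:
        # Assumption: the API keys are stored in these environment variables.
        cred = credential.Credential(
            os.environ["TENCENTCLOUD_SECRETID"],
            os.environ["TENCENTCLOUD_SECRETKEY"],
        )
        client = ocr_client.OcrClient(cred, "ap-guangzhou")
        req = models.VatInvoiceOCRRequest()
        req.from_json_string(json.dumps(
            {"ImageUrl": image_url, "IsPdf": True, "PdfPageNumber": 1}))
        resp = client.VatInvoiceOCR(req)
        res = json.loads(resp.to_json_string())
        return flatten_invoice_infos(res["VatInvoiceInfos"])
    except TencentCloudSDKException as err:
        # Assumption: report failures as a dict so the caller can serialise it.
        return {"error": str(err)}


# Made-up sample in the Name/Value shape, not real OCR output.
sample = [
    {"Name": "发票号码", "Value": "No12345678"},
    {"Name": "合计金额", "Value": "¥100.00"},
]
print(flatten_invoice_infos(sample))
```

The dict comprehension does the same job as the res_key / res_value lists plus dict(zip(...)), just in one step.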

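3. Copy the third piece of code. The first piece calls the API, the second processes the returned result, and the third instantiates a class (the Flask app). See the Jiandaoyun documentation (https://hc.jiandaoyun.com/open/12111): since it also handles a POST from a front-end event, part of its code can be used directly. We only need to copy the Flask section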

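import json
from flask import Flask, request

app = Flask(__name__)

@app.route('/fapiao', methods=['POST'])
def jpg_pdf():
    # the front-end event posts a JSON body containing the file link under 'image_url'
    image_url = json.loads(request.data).get('image_url')
    if image_url:
        # img_ocr is the function built from pieces 1 and 2 (OCR call plus result handling)
        ocr_data = img_ocr(image_url)
        return json.dumps(ocr_data)
    else:
        # a Flask view must not return None, so answer with an error instead
        return 'image_url is empty', 400

if __name__ == '__main__':
    # threaded=True lets Flask handle requests in multiple threads
    app.run(host='0.0.0.0', port=9527, threaded=True)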

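Before wiring up the front-end event, it may help to send the same kind of POST from a script to check that the service answers. The sketch below uses only the standard library; build_fapiao_request and call_fapiao are hypothetical names, the server address 203.0.113.10 is a placeholder, and the JSON body with an image_url key is assumed from what the route above reads.

```python
import json
import urllib.request


def build_fapiao_request(base_url, image_url):
    """Build the POST request that the /fapiao route reads."""
    body = json.dumps({"image_url": image_url}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/fapiao",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_fapiao(base_url, image_url, timeout=30):
    """Send the request and return the recognised fields as a dict."""
    req = build_fapiao_request(base_url, image_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Placeholder address and link; nothing is sent here, the request is only built.
req = build_fapiao_request("http://203.0.113.10:9527", "https://example.com/invoice.pdf")
print(req.get_method(), req.full_url)
```

With the server running, call_fapiao("http://your-server-ip:9527", file_link) should return the Name/Value dictionary produced by img_ocr.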

云团 · posted on 2021-7-21 10:34:40

Awesome, learned something!

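Onecax · posted on 2021-7-26 15:34:54

This is my first post. I have not completed real-name verification and do not know the rules, and the length is limited, so the complete code cannot be uploaded. If you have questions, leave a comment and I will reply as soon as I can.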

morelee · posted on 2022-1-11 10:54:26

Which Ubuntu version is the better choice?
The LTS releases available are Ubuntu 18.04 and 20.04

morelee · posted on 2022-1-27 17:41:41

Why are there no more updates?

Onecax · posted on 2022-1-28 10:47:09

The code above contains an error. Correct it as follows:
res_key.append(res_str["VatInvoiceInfos"]["Name"]) --- add [i] before ["Name"]
res_value.append(res_str["VatInvoiceInfos"]["Value"]) --- add [i] before ["Value"]

帆软用户am5ymeUaqa · posted on 2022-3-29 11:06:27

How well does it work in practice?

帆薯仔 · posted on 2024-12-2 15:15:39

Wow, this is exactly what I need 🫠
