add PP-ChatOCRv4 and support PDF #2734

Closed

dyning (Collaborator) commented Dec 26, 2024

No description provided.

paddle-bot bot commented Dec 26, 2024

Thanks for your contribution!

use_table_recognition=True,
)

# ####[TODO] Add category information
Collaborator: Will this be changed later?

Collaborator (Author): Probably not; this is unlikely to change.

Collaborator: TODO: the other CV modules need to be updated in sync.

Returns:
OCRResult: The predicted OCR result with updated dt_boxes.
"""
overall_ocr_res = next(self.general_ocr_pipeline(image_array))
Collaborator: Is `next` used here and in the earlier calls because batch_size is 1?
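For context on the question above, here is a minimal sketch of why `next` works for a single input. It assumes the pipeline behaves like a generator that yields one result per input; `fake_ocr_pipeline` and its result fields are hypothetical stand-ins, not the PR's code.

```python
from typing import Iterator, List


def fake_ocr_pipeline(images: List[str]) -> Iterator[dict]:
    """Hypothetical stand-in for a pipeline that yields one result per input."""
    for image in images:
        yield {"input": image, "dt_boxes": []}


# With a single input the generator yields exactly one result,
# so next() retrieves it without writing a loop.
single_result = next(fake_ocr_pipeline(["page_0.png"]))

# With several inputs, next() would return only the first result;
# iterating (or calling list()) keeps all of them.
all_results = list(fake_ocr_pipeline(["page_0.png", "page_1.png"]))
```

If the pipeline were ever called with more than one image, `next` alone would discard the later results, which is presumably what the reviewer wants confirmed.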


-class TableRecognitionResult(CVResult, HtmlMixin, XlsxMixin):
+class TableRecognitionResult(BaseCVResult, HtmlMixin, XlsxMixin):
Collaborator: Should HtmlMixin and XlsxMixin be merged into a single TableMixin that provides the HTML- and XLSX-related methods?
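One way the suggested merge could look, as a rough sketch only. The mixin bodies, the `rows` key, and the method names `to_html` and `to_xlsx_rows` are hypothetical; the real HtmlMixin and XlsxMixin in the repository may expose different methods.

```python
from html import escape
from typing import List


class HtmlMixin:
    """Hypothetical stand-in: renders self["rows"] as an HTML table."""

    def to_html(self) -> str:
        body = "".join(
            "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
            for row in self["rows"]
        )
        return f"<table>{body}</table>"


class XlsxMixin:
    """Hypothetical stand-in: returns the rows a spreadsheet writer would receive."""

    def to_xlsx_rows(self) -> List[list]:
        return [list(row) for row in self["rows"]]


class TableMixin(HtmlMixin, XlsxMixin):
    """Single mixin exposing both export paths."""


class TableResult(dict, TableMixin):
    """Toy result type; the class in the PR also inherits BaseCVResult."""


result = TableResult(rows=[["a", 1], ["b", 2]])
html_text = result.to_html()
xlsx_rows = result.to_xlsx_rows()
```

Result classes would then list one mixin instead of two, while the existing mixins stay usable on their own.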

class PP_ChatOCR_Pipeline(BasePipeline):
"""PP-ChatOCR Pipeline"""

entities = ["PP-ChatOCRv3-doc", "PP-ChatOCRv4-doc"]
Collaborator: Will later ChatOCR iterations keep following this logic? If a future version introduces major changes, should it be written as a separate pipeline?

Collaborator (Author): As things stand, the overall logic is not expected to change.

layout_parsing_config = config["SubPipelines"]["LayoutParser"]
self.layout_parsing_pipeline = self.create_pipeline(layout_parsing_config)

from .. import create_chat_bot
Collaborator: Should this import be moved to the top of the file?

Collaborator (Author): Moving it to the top would cause a circular import.
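The deferred-import pattern described in this reply can be sketched as follows. The module names `factory_mod` and `pipeline_mod` and the class `ChatPipeline` are hypothetical and only mimic the shape of the dependency; they are not the PR's actual modules.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two hypothetical modules that depend on each other.
FACTORY_SRC = '''
from pipeline_mod import ChatPipeline  # top-level import of the pipeline module


def create_chat_bot(name):
    return f"bot:{name}"


def create_pipeline():
    return ChatPipeline()
'''

PIPELINE_SRC = '''
class ChatPipeline:
    def __init__(self):
        # Deferred import: runs at call time, after factory_mod has finished loading.
        from factory_mod import create_chat_bot

        self.chat_bot = create_chat_bot("demo")
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "factory_mod.py").write_text(FACTORY_SRC)
    Path(tmp, "pipeline_mod.py").write_text(PIPELINE_SRC)
    sys.path.insert(0, tmp)
    try:
        factory_mod = importlib.import_module("factory_mod")
        pipeline = factory_mod.create_pipeline()
        chat_bot = pipeline.chat_bot
    finally:
        sys.path.remove(tmp)
```

If the `from factory_mod import create_chat_bot` line sat at the top of `pipeline_mod` instead, it would run while `factory_mod` was still only partially initialized, and the import would fail.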

all_table_text_list,
all_table_html_list,
all_table_nei_text_list,
) = all_visual_info

final_results = {}
failed_results = ["大模型调用失败", "未知", "未找到关键信息", "None", ""]
Collaborator: Judging from the code above, shouldn't failed_results be a dict?

Collaborator (Author): Fixed.

Collaborator: In `def generate_and_merge_chat_results(self, prompt: str, key_list: list, final_results: dict, failed_results: dict)`, failed_results appears to be a list.

Collaborator (Author): Fixed.
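To illustrate the annotation point raised in this thread, here is a small sketch in which `failed_results` is typed and used as a list of sentinel strings. `merge_chat_results` and the `chat_result` argument are hypothetical; this is not the PR's `generate_and_merge_chat_results` method, whose body is not shown in the thread.

```python
from typing import Dict, List


def merge_chat_results(
    chat_result: Dict[str, str],
    key_list: List[str],
    final_results: Dict[str, str],
    failed_results: List[str],
) -> List[str]:
    """Copy usable answers into final_results; return the keys still unanswered.

    Hypothetical helper: failed_results is a list of sentinel strings,
    so it is annotated List[str] and used only for membership tests.
    """
    remaining = []
    for key in key_list:
        value = chat_result.get(key)
        if value is None or value in failed_results:
            remaining.append(key)
        else:
            final_results[key] = value
    return remaining


# Sentinel values copied from the snippet under review.
failed_results = ["大模型调用失败", "未知", "未找到关键信息", "None", ""]
final_results: Dict[str, str] = {}
remaining = merge_chat_results(
    {"name": "Alice", "date": "未知"},
    ["name", "date", "amount"],
    final_results,
    failed_results,
)
```

Keeping the annotation and the runtime value in agreement means a type checker can flag dict-style access on `failed_results` before it reaches review.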

assert (
BaseCVResult.INPUT_IMG_KEY in data
), f"`{BaseCVResult.INPUT_IMG_KEY}` is needed, but not found in `{list(data.keys())}`!"
self._input_img = data.pop("input_img", None)
TingquanGao (Collaborator), Dec 27, 2024: I suggest not deleting this for now. I will remove it in a follow-up PR, because the change has to be made together with the other CV modules. They all depend on self._input_img, so deleting it here would break CI.
