inclusions and exclusions #30
Comments
Do I understand you correctly: |
Yes, exactly. For example, I often need to compare multi-page PDFs where checking pages 1, 3 and 5 (out of 100) is enough. config.conf As for the areas, many of the PDFs have variable data around the edges of the pages (e.g. barcodes, service lines etc.). These don't need to be compared, only the content in the 'centre', so specifying one content box would be easier. Right now (since yesterday ;)) I simply add exclusion boxes for these variable elements. |
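To illustrate the workaround described above, here is a minimal sketch of how one content box could be turned into the surrounding exclusion boxes. The `Box` type, the `exclusions_around` helper and the `x1`/`y1`/`x2`/`y2` field names are assumptions made for this example, not part of the project, and the units are whatever the page coordinates use.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle; (x1, y1) and (x2, y2) are opposite corners."""
    x1: int
    y1: int
    x2: int
    y2: int


def exclusions_around(page: Box, content: Box) -> List[Box]:
    """Return up to four boxes covering the page area outside `content`."""
    # Clip the content box to the page so the strips never leave the page.
    cx1, cy1 = max(page.x1, content.x1), max(page.y1, content.y1)
    cx2, cy2 = min(page.x2, content.x2), min(page.y2, content.y2)
    if cx1 >= cx2 or cy1 >= cy2:
        # The content box does not overlap the page: exclude everything.
        return [page]
    candidates = [
        Box(page.x1, page.y1, page.x2, cy1),  # full-width strip on the low-y side
        Box(page.x1, cy2, page.x2, page.y2),  # full-width strip on the high-y side
        Box(page.x1, cy1, cx1, cy2),          # strip on the low-x side
        Box(cx2, cy1, page.x2, cy2),          # strip on the high-x side
    ]
    # Drop strips with zero width or height (content touches the page edge).
    return [b for b in candidates if b.x1 < b.x2 and b.y1 < b.y2]


if __name__ == "__main__":
    for box in exclusions_around(Box(0, 0, 100, 200), Box(10, 20, 90, 180)):
        print(box)
```

The four strips do not overlap each other, so each one could be written out as a separate entry in an exclusions list.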
What do you mean by "rand"? |
'rand' being a random page, or 'rand(6)' to pick 6 random pages to compare. |
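As a rough sketch of the idea above, a page specification could be resolved to concrete page numbers like this. The spec syntax (a comma-separated list, `rand`, or `rand(n)`), the `resolve_pages` function and its `seed` parameter are all hypothetical and only serve to illustrate the suggestion; nothing like this exists in the project as far as this thread shows.

```python
import random
import re
from typing import List, Optional


def resolve_pages(spec: str, page_count: int, seed: Optional[int] = None) -> List[int]:
    """Turn a spec such as "1,3,5", "rand" or "rand(6)" into sorted 1-based page numbers."""
    spec = spec.strip()
    match = re.fullmatch(r"rand(?:\((\d+)\))?", spec)
    if match:
        # "rand" alone means one random page; "rand(n)" means n distinct pages.
        wanted = int(match.group(1)) if match.group(1) else 1
        wanted = min(wanted, page_count)  # cannot pick more pages than exist
        rng = random.Random(seed)         # a fixed seed makes the choice repeatable
        return sorted(rng.sample(range(1, page_count + 1), wanted))
    # Otherwise expect an explicit comma-separated list of page numbers.
    pages = sorted({int(part) for part in spec.split(",")})
    out_of_range = [p for p in pages if not 1 <= p <= page_count]
    if out_of_range:
        raise ValueError(f"pages outside 1..{page_count}: {out_of_range}")
    return pages


if __name__ == "__main__":
    print(resolve_pages("1, 3, 5", page_count=100))
    # Printing the chosen pages lets a later run compare exactly the same ones.
    print(resolve_pages("rand(6)", page_count=100, seed=2024))
```

Passing the same seed again, or feeding the printed page list back in as an explicit spec, would repeat the comparison on the same pages.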
What sense does it make to compare random pages? |
Well, wouldn't it be quicker to just compare a subset of pages rather than the entire document? At least in theory? |
Quicker, yes. But you are only comparing a subset, so you are losing confidence. |
When one is dealing with hundreds of thousands of pages spread over hundreds of documents, it's impracticable from a time/resources point of view to compare each and every one (outside of a dev/test environment). If a small subset is compared (and the specific pages are reported in the output), then the test is, of course, reproducible, and 'spot checks' could be inserted into the production workflow without delaying the process too much. At the end of the day, it's just a feature idea and not a deal-breaker! |
Hi,
It goes without saying that the exclusions array in the HOCON file is incredibly useful.
Would it also be feasible to specify content areas that should be tested as well as ones that shouldn't?
An array of pages that should be tested would also be great. Possibly also a random selection of pages.
Don't get me wrong, I will also look into the source and see whether these features are something I could contribute to the project. I just thought I'd ask to see if any work has started on them.