AWS Textract study

Some code I wrote while learning what Textract is and how to use it.

The project also contains a shell script which uses the AWS CLI v2 to perform the same task.

How to use it?

Put your invoices in the demo_data directory. Here's an example of the directory structure:

.
├── demo_data
│   ├── invoice.pdf
│   └── invoices
│       ├── other_invoice.jpg
│       └── and_one_more_invoice.png
├── readme.md
└── src
    └── main.py
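
Since invoices can sit either directly in demo_data or in nested directories, the input files have to be collected recursively. Here is a minimal sketch of how that could be done. The `find_invoices` helper and the `SUPPORTED_SUFFIXES` set are hypothetical names chosen for illustration and are not taken from src/main.py.

```python
from pathlib import Path

# Assumption: the file types worth sending to Textract; not taken from src/main.py.
SUPPORTED_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png"}


def find_invoices(root):
    """Return all supported files under root, including subdirectories, sorted by path."""
    root = Path(root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )


if __name__ == "__main__":
    # Prints one path per line for the directory layout shown above.
    for invoice in find_invoices("demo_data"):
        print(invoice)
```

Matching on the lowercased suffix means files such as INVOICE.PDF are picked up too, while unrelated files in the same directories are skipped.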

Provide your AWS credentials, region and S3 bucket name as environment variables:

$ export AWS_ACCESS_KEY_ID=your_access_key_id          # for example "AKIAIOSFODNN7EXAMPLE"
$ export AWS_SECRET_ACCESS_KEY=your_secret_access_key  # for example "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
$ export AWS_REGION=region                             # for example "us-east-1"
$ export AWS_BUCKET=bucket_name                        # for example "my-textract-study-bucket"
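
A missing variable is easier to diagnose when it is reported before any AWS call is made. The sketch below shows one way to check for the four variables listed above. The `load_settings` helper is a hypothetical name used for illustration, and src/main.py may read its configuration differently.

```python
import os

# The four variable names come from this readme; the helper itself is illustrative.
REQUIRED_VARIABLES = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_BUCKET",
)


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any are missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARIABLES if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARIABLES}
```

Calling `load_settings()` with no arguments reads from the process environment. Passing a plain dict instead makes the check easy to try out without touching real credentials.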

Run the script:
```
$ python src/main.py
```

The reports will be written to a new directory named output/<uuid>:

.
├── demo_data
│   ├── invoice.pdf
│   └── invoices
│       ├── other_invoice.jpg
│       └── and_one_more_invoice.png
├── output
│   └── 456af71d-f7b2-4bf8-87c7-bade21d843d4
│       ├── report.csv
│       ├── report.json
│       └── report.xlsx
├── readme.md
└── src
    └── main.py
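
The layout above, with one fresh directory per run, can be sketched as follows. This is an illustration under assumptions and not the code from src/main.py: `write_reports` is a hypothetical helper, the row fields are whatever the caller passes in, and report.xlsx is left out because writing it needs a third-party library.

```python
import csv
import json
import uuid
from pathlib import Path


def write_reports(rows, output_root="output"):
    """Write rows (a list of dicts with the same keys) to a fresh output/<uuid> directory."""
    report_dir = Path(output_root) / str(uuid.uuid4())
    report_dir.mkdir(parents=True)

    with open(report_dir / "report.json", "w", encoding="utf-8") as handle:
        json.dump(rows, handle, indent=2)

    # Column order follows the keys of the first row.
    fieldnames = list(rows[0]) if rows else []
    with open(report_dir / "report.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    return report_dir
```

Using a random UUID for the directory name means that repeated runs never overwrite an earlier report.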

Notes

This script uses busy waiting for Textract job results (in the retrieve_analyses function). It is not optimal. In fact, it is pretty terrible for performance. Use job-completion notifications instead (Textract can publish them to an Amazon SNS topic).
The whole thing is just one file. Terrible for legibility but eh, it works.
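
As a rough sketch of the notification approach, an asynchronous Textract job can be started with a `NotificationChannel`, so that Textract publishes a message to an SNS topic when the job finishes and nothing has to loop on the job status. The sketch assumes the job is started with `start_document_analysis`, which this readme does not confirm. The function name, the feature types and both ARNs are placeholders.

```python
def start_analysis_with_notification(textract, bucket, key, sns_topic_arn, role_arn):
    """Start an asynchronous analysis job and ask Textract to notify an SNS topic.

    textract is a boto3 Textract client, e.g. boto3.client("textract").
    role_arn must belong to an IAM role that lets Textract publish to the topic.
    """
    response = textract.start_document_analysis(
        DocumentLocation={"S3Object": {"Bucket": bucket, "Name": key}},
        # Assumption: tables and forms are the features of interest for invoices.
        FeatureTypes=["TABLES", "FORMS"],
        NotificationChannel={"SNSTopicArn": sns_topic_arn, "RoleArn": role_arn},
    )
    return response["JobId"]


# Example call (placeholder ARNs, requires boto3 and valid credentials):
#
#   import boto3
#   job_id = start_analysis_with_notification(
#       boto3.client("textract"),
#       "my-textract-study-bucket",
#       "invoice.pdf",
#       "arn:aws:sns:us-east-1:123456789012:textract-jobs",
#       "arn:aws:iam::123456789012:role/textract-sns-publisher",
#   )
```

A subscriber to the topic (an SQS queue or a Lambda function, for instance) can then fetch the results once, after the completion message arrives.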

mycielski/textract_study
