chore: update tika ocr and pdf scanning (#248)
* chore: update tika ocr and pdf scanning

This Closes #247

* chore: update from 7071 to 8070
saidsef committed May 11, 2024
1 parent c93f0d7 commit 5af2887
Showing 7 changed files with 9 additions and 8 deletions.
2 changes: 1 addition & 1 deletion deployment/k8s/base/server/deployment.yml
@@ -65,7 +65,7 @@ spec:
 fieldPath: status.podIP
 ports:
 - name: server
-containerPort: 7071
+containerPort: 8070
 protocol: TCP
 - name: metrics
 containerPort: 7072
2 changes: 1 addition & 1 deletion deployment/k8s/base/server/service.yml
@@ -17,7 +17,7 @@ spec:
 ports:
 - name: server
 protocol: TCP
-port: 7071
+port: 8070
 targetPort: server
 - name: metrics
 protocol: TCP
2 changes: 1 addition & 1 deletion deployment/k8s/base/ui/deployment.yml
@@ -62,7 +62,7 @@ spec:
 - name: HOST
 value: "tika-server"
 - name: HOST_PORT
-value: "7071"
+value: "8070"
 ports:
 - name: http
 containerPort: 8080
2 changes: 1 addition & 1 deletion function/Dockerfile
@@ -10,7 +10,7 @@ ENV JAI_IMAGEIO_JPEG2000 1.4.0
 ENV JBIG2_IMAGEIO 3.0.4
 ENV MS 180000
 ENV POD_IP ${POD_IP:-127.0.0.1}
-ENV PORT ${PORT:-7070}
+ENV PORT ${PORT:-8070}
 ENV PYTHONIOENCODING utf8
 ENV SQLITE_JDBC 3.40.0.0
 
2 changes: 1 addition & 1 deletion function/Dockerfile.server
@@ -6,7 +6,7 @@ LABEL org.opencontainers.image.description "Apache Tika API Server"
 
 ENV TIKA_VERSION 2.9.2
 ENV CLASSPATH "/tika-server-standard-${TIKA_VERSION}.jar:/tika-extras/*:/opt/tika/libs/*"
-ENV PORT ${PORT:-7071}
+ENV PORT ${PORT:-8070}
 ENV PORT_METRICS ${PORT_METRICS:-7072}
 ENV POD_IP ${POD_IP:-127.0.0.1}
 ENV PROMETHEUS_JMX_JAR_VERSION 0.20.0
2 changes: 1 addition & 1 deletion function/handler.py
@@ -6,7 +6,7 @@
 from subprocess import Popen, PIPE
 
 logging.getLogger(__name__)
-logging.basicConfig(level=logging.DEBUG)
+logging.basicConfig(level=logging.DEBUG, handlers=[logging.StreamHandler()])
 
 
 def byte2json(b):
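For reference, the `handler.py` change above can be read as the following minimal sketch. The names `handler` and `logger` are illustrative and do not appear in the commit. `logging.StreamHandler()` with no arguments writes to `sys.stderr`, and `logging.basicConfig` does nothing if the root logger already has handlers, so it should run before any other logging setup.

```python
import logging

# StreamHandler() with no argument targets sys.stderr.
handler = logging.StreamHandler()

# basicConfig is a no-op when the root logger is already configured,
# so this call is expected to happen once, at import time.
logging.basicConfig(level=logging.DEBUG, handlers=[handler])

# Keeping the returned logger (the commit discards it) lets callers use it directly.
logger = logging.getLogger(__name__)
logger.debug("logging configured")
```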
5 changes: 3 additions & 2 deletions ui/app.js
@@ -26,7 +26,7 @@ const uploads = multer({ storage });
 const TIMEOUT = 500000; // Milliseconds
 const PORT = process.env.PORT || 8080;
 const HOST = process.env.HOST || 'server';
-const HOST_PORT = process.env.HOST_PORT || 7071;
+const HOST_PORT = process.env.HOST_PORT || 8070;
 const protocol = process.env.PROTOCOL === 'https' ? https : http;
 
 collectDefaultMetrics({ register: prometheusRegister });
@@ -92,7 +92,8 @@ app.post('/', uploads.single('doc'), async (req, res, next) => {
 if (req.file) {
 payload = req.file.buffer;
 options.headers = {
-'Content-Type': req.file.mimetype
+'Content-Type': req.file.mimetype,
+'X-Tika-PDFocrStrategy': 'auto'
 }
 }
 const post = protocol.request(options, (response) => {
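The `ui/app.js` change forwards the upload to the Tika server on the new port with an extra `X-Tika-PDFocrStrategy: auto` header. A rough Python equivalent is sketched below, under assumptions: the helper name `build_tika_request` is hypothetical, and the `PUT /tika` target is Tika Server's standard extraction endpoint rather than something shown in this diff. The function only builds the request, it does not send it.

```python
import os
import urllib.request

# Defaults mirror the values introduced by this commit.
HOST = os.environ.get("HOST", "server")
HOST_PORT = int(os.environ.get("HOST_PORT", "8070"))


def build_tika_request(payload: bytes, mimetype: str,
                       host: str = HOST,
                       port: int = HOST_PORT) -> urllib.request.Request:
    """Build (but do not send) a request that lets Tika choose the PDF OCR strategy."""
    headers = {
        "Content-Type": mimetype,
        # Same header and value that the commit adds in ui/app.js.
        "X-Tika-PDFocrStrategy": "auto",
    }
    # Assumption: the server exposes Tika's /tika endpoint over plain HTTP.
    url = f"http://{host}:{port}/tika"
    return urllib.request.Request(url, data=payload, headers=headers, method="PUT")
```

Sending it would be `urllib.request.urlopen(build_tika_request(data, "application/pdf"), timeout=500)`, which needs a reachable server on port 8070.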
