Problems getting started #1

Open
lb42 opened this issue Jan 13, 2022 · 7 comments

lb42 commented Jan 13, 2022

The README says I should start by running docker compose build --build-arg ADMIN_PASS=my_pass
I assume this is a typo for docker-compose build --build-arg ADMIN_PASS=my_pass

However, I don't understand where to install the file docker-compose.yml: if I just run it from the local copy I downloaded from GitHub, I get permission errors.

(spacyenv) lou@foxglove:~/Public/teipublisher-docker-compose$ ls -l ./docker-compose.yml 
-rwxrwxr-x 1 lou lou 773 Jan 13 17:11 ./docker-compose.yml
(spacyenv) lou@foxglove:~/Public/teipublisher-docker-compose$ sudo docker-compose build --build-arg ADMIN_PASS=xxxx
ERROR: .PermissionError: [Errno 13] Permission denied: './docker-compose.yml'

I am clearly missing something obvious!

@lb42
Author

lb42 commented Jan 13, 2022

Ah, I got further after adding my username to the docker group...

(base) lou@foxglove:~/Public/teipublisher-docker-compose$ docker-compose build --build-arg ADMIN_PASS=xxxx
frontend uses an image, skipping
certbot uses an image, skipping
Building ner
Sending build context to Docker daemon  586.2kB
Step 1/7 : FROM python:3-slim
 ---> 58d8fd9767c5
Step 2/7 : RUN apt-get update && apt-get install -y git
 ---> Using cache
 ---> cde6ed1db696
Step 3/7 : WORKDIR /workspace
 ---> Using cache
 ---> 854d07fed8c5
Step 4/7 : RUN git clone https://github.com/eeditiones/tei-publisher-ner.git     && cd tei-publisher-ner     && pip3 install --no-cache-dir --upgrade -r requirements.txt     && python3 -m spacy download de_core_news_sm     && python3 -m spacy download en_core_web_sm
 ---> Using cache
 ---> c3116b883e6f
Step 5/7 : EXPOSE 8001
 ---> Using cache
 ---> b6293c78d3e0
Step 6/7 : WORKDIR /workspace/tei-publisher-ner
 ---> Using cache
 ---> 1e115668147e
Step 7/7 : CMD [ "python3", "-m", "spacy", "project", "run", "serve" ]
 ---> Using cache
 ---> e6fdea2a9b54
[Warning] One or more build-args [ADMIN_PASS] were not consumed
Successfully built e6fdea2a9b54
Successfully tagged teipublisher-docker-compose_ner:latest
Building publisher
Sending build context to Docker daemon  86.31MB
Error response from daemon: Dockerfile parse error line 85: ARG requires exactly one argument
ERROR: Service 'publisher' failed to build : Build failed
(base) lou@foxglove:~/Public/teipublisher-docker-compose$ 
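
For anyone hitting the same permission problem: the group change mentioned at the top of this comment is usually done along the following lines on a Linux host. This is a minimal sketch and is not taken from the README. The `in_group` helper is a hypothetical name, and the sketch assumes the `docker` group already exists on the machine.

```shell
#!/bin/sh
# in_group USER GROUP: succeed if USER is a member of GROUP.
# (Hypothetical helper, introduced only for this illustration.)
in_group() {
    id -nG "$1" | tr ' ' '\n' | grep -qx -- "$2"
}

user="$(id -un)"
if in_group "$user" docker; then
    echo "$user is already in the docker group"
else
    # Only print the command; changing group membership needs root.
    echo "$user is not in the docker group; to add it, run:"
    echo "  sudo usermod -aG docker $user"
    echo "then log out and back in (or run 'newgrp docker') to pick up the change"
fi
```

Note that members of the `docker` group can effectively act as root on the host, so this is a convenience trade-off rather than a hardening step.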

@joewiz
Member

joewiz commented Jan 13, 2022

It's not a typo! ;) See https://docs.docker.com/compose/cli-command/:

The new Compose V2, which supports the compose command as part of the Docker CLI, is now available.

Compose V2 integrates compose functions into the Docker platform, continuing to support most of the previous docker-compose features and flags. You can test the Compose V2 by simply replacing the dash (-) with a space, and by running docker compose, instead of docker-compose.

Starting with Docker Desktop 3.4.0, you can run Compose V2 commands without modifying your invocations, by enabling the drop-in replacement of the previous docker-compose with the new command.
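
Since both spellings are in circulation, a script can check which one is installed before building. The following is a minimal sketch under that assumption; `pick_compose` is a hypothetical helper name, and the build command is only echoed, not run.

```shell
#!/bin/sh
# pick_compose: print the Compose command available on this machine.
# (Hypothetical helper, introduced only for this illustration.)
pick_compose() {
    if docker compose version >/dev/null 2>&1; then
        echo "docker compose"      # Compose V2, integrated into the Docker CLI
    elif command -v docker-compose >/dev/null 2>&1; then
        echo "docker-compose"      # standalone docker-compose binary
    else
        echo "none"
    fi
}

compose="$(pick_compose)"
if [ "$compose" = "none" ]; then
    echo "No Compose command found" >&2
else
    echo "Would run: $compose build --build-arg ADMIN_PASS=my_pass"
fi
```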

@lb42
Author

lb42 commented Jan 14, 2022

OK, it's not a typo. I installed the V2 docker compose (I think:

docker compose version
Docker Compose version v2.2.2
(base) lou@foxglove:~$ docker --version
Docker version 20.10.7, build 20.10.7-0ubuntu5~20.04.2

neither of which says anything about a "docker desktop" version but still)

However I am still getting the same result:

(base) lou@foxglove:~$ cd ~/Public/teipublisher-docker-compose/
(base) lou@foxglove:~/Public/teipublisher-docker-compose$ docker compose build --build-arg ADMIN_PASS=mudshark
Sending build context to Docker daemon  424.7kB
Step 1/7 : FROM python:3-slim
 ---> 58d8fd9767c5
Step 2/7 : RUN apt-get update && apt-get install -y git
 ---> Using cache
 ---> cde6ed1db696
Step 3/7 : WORKDIR /workspace
 ---> Using cache
 ---> 854d07fed8c5
Step 4/7 : RUN git clone https://github.com/eeditiones/tei-publisher-ner.git     && cd tei-publisher-ner     && pip3 install --no-cache-dir --upgrade -r requirements.txt     && python3 -m spacy download de_core_news_sm     && python3 -m spacy download en_core_web_sm
 ---> Using cache
 ---> c3116b883e6f
Step 5/7 : EXPOSE 8001
 ---> Using cache
 ---> b6293c78d3e0
Step 6/7 : WORKDIR /workspace/tei-publisher-ner
 ---> Using cache
 ---> 1e115668147e
Step 7/7 : CMD [ "python3", "-m", "spacy", "project", "run", "serve" ]
 ---> Using cache
 ---> e6fdea2a9b54
[Warning] One or more build-args [ADMIN_PASS] were not consumed
Successfully built e6fdea2a9b54
Successfully tagged teipublisher-docker-compose_ner:latest
Sending build context to Docker daemon  66.27MB
1 error occurred:
	* Error response from daemon: Dockerfile parse error line 85: ARG requires exactly one argument
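
The parse error names an `ARG` instruction. According to the Dockerfile reference, `ARG` takes a single `NAME` or `NAME=default` token, so one way to narrow this down is to look for `ARG` lines that carry more than one token. The sketch below is a rough heuristic, not a diagnosis of the actual line 85: `suspect_arg_lines` is a hypothetical helper, and the demo file contents are invented for illustration.

```shell
#!/bin/sh
# suspect_arg_lines FILE: print (with line numbers) ARG instructions that
# have more than one whitespace-separated token after the keyword.
# Heuristic only: it also flags quoted defaults such as ARG X="a b".
suspect_arg_lines() {
    grep -niE '^[[:space:]]*ARG[[:space:]]+[^[:space:]]+[[:space:]]+[^[:space:]]' "$1"
}

# Demo on a throwaway file; the contents are made up for this example.
tmp="$(mktemp)"
printf '%s\n' 'FROM alpine' 'ARG GOOD=none' 'ARG BAD = none' > "$tmp"
suspect_arg_lines "$tmp" || echo "no suspicious ARG lines"
# prints: 3:ARG BAD = none
rm -f "$tmp"
```

To use it for real, point it at the Dockerfile of the `publisher` build context instead of the temporary file.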

@wolfgangmm
Member

I can't see tei-publisher-app being built in the output you provided. It looks like it's building only the tei-publisher-ner docker image, not the other 3 defined by the docker-compose.yml. Could you try running the build again with --no-cache:

docker compose build --build-arg ADMIN_PASS=my_pass --no-cache

I also corrected the default value for "ADMIN_PASS" in the tei-publisher-app Dockerfile. Maybe this helps as well. Below is the output I see after running the above command, so there's definitely a lot more going on:

➜  teipublisher-docker-compose git:(master) ✗ docker compose build  --build-arg ADMIN_PASS=my_pass --no-cache
[+] Building 129.0s (34/34) FINISHED
 => CACHED [teipublisher-docker-compose_ner internal] load git source https://github.com/eeditiones/tei-publisher-ner.gi  0.0s
 => [teipublisher-docker-compose_publisher internal] load git source https://github.com/eeditiones/tei-publisher-app.git  0.9s
 => [teipublisher-docker-compose_ner internal] load metadata for docker.io/library/python:3-slim                          1.6s
 => [auth] library/python:pull token for registry-1.docker.io                                                             0.0s
 => [teipublisher-docker-compose_publisher internal] load metadata for docker.io/library/eclipse-temurin:11-jre-alpine    0.7s
 => [teipublisher-docker-compose_publisher internal] load metadata for docker.io/library/openjdk:8-jdk-slim               0.7s
 => CACHED [teipublisher-docker-compose_ner 1/5] FROM docker.io/library/python:3-slim@sha256:dd3016f846b8f88d8f6c28b43f1  0.0s
 => [teipublisher-docker-compose_ner 2/5] RUN apt-get update && apt-get install -y git                                   11.7s
 => [teipublisher-docker-compose_publisher builder 1/4] FROM docker.io/library/openjdk:8-jdk-slim@sha256:25efb6e0609b95a  0.0s
 => CACHED [teipublisher-docker-compose_publisher stage-2  1/10] FROM docker.io/library/eclipse-temurin:11-jre-alpine@sh  0.0s
 => CACHED [teipublisher-docker-compose_publisher builder 2/4] WORKDIR /tmp                                               0.0s
 => [teipublisher-docker-compose_publisher builder 3/4] RUN apt-get update && apt-get install -y     git     curl        10.7s
 => [teipublisher-docker-compose_publisher stage-2  2/10] RUN apk add curl                                                1.6s
 => [teipublisher-docker-compose_publisher stage-2  3/10] RUN curl -L -o /tmp/exist-distribution-5.3.1-unix.tar.bz2 htt  44.6s
 => [teipublisher-docker-compose_publisher builder 4/4] RUN curl -L -o apache-ant-1.10.11-bin.tar.gz http://www.apache.o  2.1s
 => [teipublisher-docker-compose_ner 3/5] WORKDIR /workspace                                                              0.0s
 => [teipublisher-docker-compose_ner 4/5] RUN git clone https://github.com/eeditiones/tei-publisher-ner.git     && cd t  27.9s
 => [teipublisher-docker-compose_publisher tei 1/8] RUN  mkdir -p ~/.ssh && ssh-keyscan -t rsa github.com >> ~/.ssh/know  0.8s
 => [teipublisher-docker-compose_publisher tei 2/8] RUN  git clone https://github.com/eeditiones/shakespeare.git     &&   3.7s
 => [teipublisher-docker-compose_publisher tei 3/8] RUN  git clone https://github.com/eeditiones/vangogh.git     && cd v  8.5s
 => [teipublisher-docker-compose_publisher tei 4/8] RUN  git clone https://github.com/eeditiones/tei-publisher-app.git   28.7s
 => [teipublisher-docker-compose_ner 5/5] WORKDIR /workspace/tei-publisher-ner                                            0.0s
 => [teipublisher-docker-compose_publisher] exporting to image                                                            3.4s
 => => exporting layers                                                                                                   3.4s
 => => writing image sha256:a80423ec99aa68913998e0b765c0d5e5769b6fe3036a86d1f077ced6a955bc07                              0.0s
 => => naming to docker.io/library/teipublisher-docker-compose_ner                                                        0.0s
 => => writing image sha256:71064c35627ac651acae24132cefed22d77f6c427b228093eb5da521bbcdf08d                              0.0s
 => => naming to docker.io/library/teipublisher-docker-compose_publisher                                                  0.0s
 => [teipublisher-docker-compose_publisher tei 5/8] RUN curl -L -o /tmp/oas-router-0.5.1.xar http://exist-db.org/exist/a  0.6s
 => [teipublisher-docker-compose_publisher tei 6/8] RUN curl -L -o /tmp/tei-publisher-lib-2.10.0.xar http://exist-db.org  0.5s
 => [teipublisher-docker-compose_publisher tei 7/8] RUN curl -L -o /tmp/templating-1.0.2.xar http://exist-db.org/exist/a  0.4s
 => [teipublisher-docker-compose_publisher tei 8/8] RUN curl -L -o /tmp/shared-resources-0.9.1.xar http://exist-db.org/e  0.9s
 => [teipublisher-docker-compose_publisher stage-2  4/10] COPY --from=tei /tmp/tei-publisher-app/build/*.xar /exist/auto  0.1s
 => [teipublisher-docker-compose_publisher stage-2  5/10] COPY --from=tei /tmp/shakespeare/build/*.xar /exist/autodeploy  0.0s
 => [teipublisher-docker-compose_publisher stage-2  6/10] COPY --from=tei /tmp/vangogh/build/*.xar /exist/autodeploy/     0.0s
 => [teipublisher-docker-compose_publisher stage-2  7/10] COPY --from=tei /tmp/*.xar /exist/autodeploy/                   0.0s
 => [teipublisher-docker-compose_publisher stage-2  8/10] WORKDIR /exist                                                  0.0s
 => [teipublisher-docker-compose_publisher stage-2  9/10] RUN bin/client.sh -l --no-gui --xpath "system:get-version()"   59.2s
 => [teipublisher-docker-compose_publisher stage-2 10/10] RUN if [ "my_pass" != "none" ]; then bin/client.sh -l --no-gui  7.1

@lb42
Author

lb42 commented Jan 14, 2022

OK, many thanks for the suggestion. Doing that I seem to have reinstalled everything, and now the run seems to terminate OK

....
Step 5/7 : EXPOSE 8001
 ---> Running in 0e7e551ed4b5
 ---> 64b402220f22
Step 6/7 : WORKDIR /workspace/tei-publisher-ner
 ---> Running in 30cca5fe960b
 ---> ccf24e853af1
Step 7/7 : CMD [ "python3", "-m", "spacy", "project", "run", "serve" ]
 ---> Running in 2a07fb5fa680
 ---> 8626fca0dbf8
[Warning] One or more build-args [ADMIN_PASS] were not consumed
Successfully built 8626fca0dbf8
Successfully tagged teipublisher-docker-compose_ner:latest

though I really have no idea what I am doing -- as you can probably tell!

@lb42
Author

lb42 commented Jan 14, 2022

OK, I think this is working now! Thanks for your patience. One other thing I needed to do, which the doc didn't explain, was to stop the Apache service already running on my machine.
It seems VERY slow -- what size document are you expecting it to work with?
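
On the Apache point above: a quick way to see whether another service is already holding a port is to try connecting to it before starting the containers. This is a minimal sketch that needs bash; `port_in_use` is a hypothetical helper, and ports 80 and 443 are an assumption, so check docker-compose.yml for the ports the stack actually publishes.

```shell
#!/bin/bash
# port_in_use PORT: return 0 if something accepts TCP connections on
# 127.0.0.1:PORT, 1 otherwise. Relies on bash's /dev/tcp redirection.
# (Hypothetical helper, introduced only for this illustration.)
port_in_use() {
    if (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null; then
        return 0
    else
        return 1
    fi
}

# Ports 80 and 443 are assumed here, not taken from the project docs.
for port in 80 443; do
    if port_in_use "$port"; then
        echo "port $port is already taken on this host"
    else
        echo "port $port looks free"
    fi
done
```

On Ubuntu the Apache service is usually named `apache2`, so `sudo systemctl stop apache2` would typically free the port; that service name is an assumption about this particular machine.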

@wolfgangmm
Member

For me, entity recognition on a 5 KB document takes around 5 seconds. I have not attempted to optimize this yet, though. Running entity recognition via docker probably limits the performance of the underlying NLP engine, and the preprocessing step which extracts data from the TEI could easily be improved as well. Running via docker on the server, I seem to hit the wall processing a 100k document, though this is OK if I start entity recognition as a native service.

I started the feature as a fun project without budget. It works pretty well given the few evenings I invested, but a bit more work and testing will be needed to make it scale and turn it into a generally usable feature. We're currently thinking about putting together another joint funding proposal via e-editiones. Interested parties are welcome to join.
