# word-language-detector-ai-api
The API for the word language detector AI.
## Installation
Run the following script to install the API:
```bash
LOCATION=/srv/word-language-detector-ai-api
# install python3 if it is not installed
sudo apt install python3
# install dependencies
python3 -m pip install simple-cache --index-url https://repo.jcloud-services.ddns.net/simple/ --break-system-packages
python3 -m pip install fastapi pydantic tensorflow uvicorn scikit-learn --break-system-packages
# set up directories
sudo mkdir -p $LOCATION
# download and unpack the archive
curl -fsSL --retry 3 -o word-language-detector-ai-api_latest_linux-aarch_source.tar.gz https://repo.jcloud-services.ddns.net/software/word-language-detector-ai-api/word-language-detector-ai-api_latest_linux-aarch_source.tar.gz
sudo tar -xzf word-language-detector-ai-api_latest_linux-aarch_source.tar.gz -C $LOCATION
# Optional: download the model
# You can skip this part if you want to provide the model yourself. Note that the model directory must contain the following files: 'label_encoder.json' (label encoder), 'language_detector.keras' (the actual model), 'max_len' (maximum token length) and 'tokenizer.json' (tokenizer). Downloading the model as shown below is recommended.
curl -fsSL --retry 3 -o word-language-detector-ai_latest_tf_model_artifacts.tar.gz https://repo.jcloud-services.ddns.net/models/word-language-detector-ai/word-language-detector-ai_latest_tf_model_artifacts.tar.gz
sudo mkdir -p $LOCATION/model
sudo tar -xzf word-language-detector-ai_latest_tf_model_artifacts.tar.gz -C $LOCATION/model
```
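If you provide the model yourself, a quick sanity check like the following can confirm that the model directory contains every file the API expects (a minimal sketch; adjust `LOCATION` to your install path):

```shell
LOCATION=/srv/word-language-detector-ai-api
# Report any of the four required model artifacts that are missing
for f in label_encoder.json language_detector.keras max_len tokenizer.json; do
    [ -f "$LOCATION/model/$f" ] || echo "missing: $f"
done
```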
Note: Ensure that at least approximately 700 KB of disk space is available after installation; during the installation, approximately 850 KB must be available. Without the model, you need approximately 400 KB of disk space, and approximately 500 MB during the installation.
You can remove `word-language-detector-ai-api_latest_linux-aarch_source.tar.gz` and `word-language-detector-ai_latest_tf_model_artifacts.tar.gz` after installing the software and **checking the hashes**.
### Hashes
|File|SHA256 Hash|
|-|-|
|https://repo.jcloud-services.ddns.net/software/word-language-detector-ai-api/word-language-detector-ai-api_latest_linux-aarch_source.tar.gz|675ed865d7f0977bc216f94eee88f64b6bd9bd42fd28d1443f1ad22738d5b3f7|
|https://repo.jcloud-services.ddns.net/models/word-language-detector-ai/word-language-detector-ai_latest_tf_model_artifacts.tar.gz|52fdd714524b8cae3c49cc918e13a3568ddb96f97596b24ccb242c5e2f1c30d7|
If the hashes above do not match the actual file hashes, the software may have been tampered with, **SO DO NOT EXECUTE THE SOFTWARE**.
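The hashes can be checked with `sha256sum` after downloading; this sketch assumes the archives are in the current directory, as in the install script above:

```shell
# sha256sum -c reads "HASH  FILENAME" pairs and reports OK or FAILED per file
echo "675ed865d7f0977bc216f94eee88f64b6bd9bd42fd28d1443f1ad22738d5b3f7  word-language-detector-ai-api_latest_linux-aarch_source.tar.gz" | sha256sum -c -
echo "52fdd714524b8cae3c49cc918e13a3568ddb96f97596b24ccb242c5e2f1c30d7  word-language-detector-ai_latest_tf_model_artifacts.tar.gz" | sha256sum -c -
```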
## Run the API
To run the API, run `src/api.py` in the package directory. Here is an example with `/srv/word-language-detector-ai-api` as the package directory:
```bash
/srv/word-language-detector-ai-api/src/api.py
```
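To keep the API running after you close your shell, one generic option (not specific to this project) is `nohup` with a log file; a process supervisor such as systemd is the more robust choice for production. The log and PID file paths below are hypothetical examples and typically require root:

```shell
# Start the API detached from the terminal, logging stdout/stderr to a file
nohup /srv/word-language-detector-ai-api/src/api.py > /var/log/word-language-detector-ai-api.log 2>&1 &
# Remember the process ID so the API can be stopped later with kill
echo $! > /run/word-language-detector-ai-api.pid
```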
### Arguments
For arguments documentation, run the API with the flag `--help`.
## API Docs
For the full API documentation, see the docs path of the API.
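The API is built on FastAPI, which (unless reconfigured) serves an interactive Swagger UI at `/docs` and a machine-readable OpenAPI schema at `/openapi.json`. Assuming the server listens on `localhost:8000` (a hypothetical host/port; adjust to your configuration):

```shell
# Fetch the OpenAPI schema; open http://localhost:8000/docs in a browser for the interactive UI
curl -fsS http://localhost:8000/openapi.json
```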
## Changelog
### Version 0.2.0
- support for CORS
### Version 0.1.1
- added `--break-system-packages` to `pip install ...`
- bug fix: create model directory before downloading the model
### Version 0.1.0
- initial release
- API for the word language detector AI
- configuration
- logging
- caching
- lazy imports
- command line arguments