# CS470 Introduction to Artificial Intelligence (Team P12)
## NBTI: NN-Based Typography Incorporating Semantics
| Name | Student ID | Github |
|---|---|---|
| Doojin Baek | 20190289 | DoojinBaek |
| Min Kim | 20200072 | minggg012 |
| Dongwoo Moon | 20200220 | snaoyam |
| Dongjae Lee | 20200445 | duncan020313 |
| Hanbee Jang | 20200552 | janghanbee |
Based on: *Word-As-Image for Semantic Typography* (SIGGRAPH 2023)
We propose NBTI, an NN-based typography model that visually represents letters while reflecting the meanings inherent in both concrete and formless words. Our focus was on overcoming the limitations of the prior paper, "Word-As-Image," and on presenting future directions.

In the prior work, excessive deformation could make characters unreadable, so the degree of geometric deformation was measured and constrained. However, that constraint limits the expressive capability of the characters. We instead focus directly on readability: rather than comparing geometric values, we use a visual model that compares encoded vectors to evaluate how well each character is still recognized, a metric we call the "Embedding Loss."

The prior model also struggled to visualize formless words. To address this, we introduce a preprocessing step that uses LLM fine-tuning to transform formless words into words with concrete forms. We named the module responsible for this transformation the "Concretizer." It is built on the GPT-3.5 model, specifically the text-davinci-003 variant, fine-tuned on 427 examples. The Concretizer transforms abstract, formless words like "Sweet" and "Idea" into words with clear forms like "Candy" and "Lightbulb."
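The idea behind the Embedding Loss can be sketched as follows. The `encode` function here is a hypothetical stand-in for the project's visual encoder; the actual model and preprocessing differ:

```python
import numpy as np

def embedding_loss(encode, original_glyph, deformed_glyph):
    """Embedding-based readability loss (sketch).

    `encode` is any function mapping a glyph image to a feature
    vector -- a placeholder for the project's letter encoder.
    The loss compares encoded vectors instead of raw geometry.
    """
    a = encode(original_glyph)
    b = encode(deformed_glyph)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    # 0 when the deformed letter is still recognized identically,
    # growing toward 2 as the embeddings point in opposite directions
    return 1.0 - cos

# toy usage: a trivial "encoder" that just flattens the image
encode = lambda img: np.asarray(img, dtype=float).ravel()
loss = embedding_loss(encode, [[1, 0], [0, 1]], [[1, 0], [0, 1]])
print(loss)  # identical glyphs -> loss 0.0
```

In the real pipeline the encoder is a trained visual model, so the loss stays low only while the deformed letter remains recognizable to that model, not merely geometrically close.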
### Letter classifier dataset

```
curl http://143.248.235.11:5000/fontsdataset/dataset.zip -o ./data.zip
```
### LLM fine-tuning dataset

```
./finetuning/finetuning.jsonl
```
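Each line of `finetuning.jsonl` is presumably one JSON object in the prompt/completion format that OpenAI's legacy fine-tuning API expects. The record below is illustrative only; the real field contents may differ:

```python
import json

# One illustrative Concretizer training pair: an abstract word as the
# prompt, a concrete word as the completion (values are made up, not
# copied from finetuning.jsonl).
record = {"prompt": "Sweet ->", "completion": " Candy"}

# serialize to a single JSONL line
line = json.dumps(record)
print(line)

# round-trips back to the same record
assert json.loads(line) == record
```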
- Clone the GitHub repo:

  ```
  git clone https://github.com/DoojinBaek/CS470_NBTI
  cd CS470_NBTI
  ```

- Create a new conda environment and install the libraries:

  ```
  conda env create -f word_env.yaml
  conda activate word
  ```

- Install diffusers:

  ```
  pip install diffusers==0.8
  pip install transformers scipy ftfy accelerate
  ```

- Install diffvg:

  ```
  git clone https://github.com/BachiLi/diffvg.git
  cd diffvg
  git submodule update --init --recursive
  python setup.py install
  ```

- Execute the setup bash file:

  ```
  bash setup.sh
  ```
```
python code/main.py --experiment <experiment> --semantic_concept <concept> --optimized_letter <letter> --seed <seed> --font <font_name> --abstract <True/False> --gen_data <True/False> --use_wandb <0/1> --wandb_user <user name>
```
Required arguments:

- `--semantic_concept`: the semantic concept to insert
- `--optimized_letter`: one letter in the word to optimize
- `--font`: font name; the .ttf file should be located in `code/data/fonts/`

Optional arguments:

- `--word`: the text to work on, default: the semantic concept
- `--config`: path to config file, default: `code/config/base.yaml`
- `--experiment`: any experiment specified in the config file, default: `conformal_0.5_dist_pixel_100_kernel201`
- `--log_dir`: default: output folder
- `--prompt_suffix`: default: `"minimal flat 2d vector. lineal color. trending on artstation"`
- `--abstract`: whether the input semantic concept is abstract (formless) or not, default: `False`
- `--gen_data`: generates the data needed for the first learning run, default: `False`
- `--batch_size`: default: `1`
- Formless word (applies our encoder and the Concretizer):

  ```
  python code/main.py --semantic_concept "FANCY" --optimized_letter "Y" --font "KaushanScript-Regular" --abstract "TRUE"
  ```

- Concrete word (applies our encoder only):

  ```
  python code/main.py --semantic_concept "CAT" --optimized_letter "C" --font "Moonies" --abstract "FALSE"
  ```