News Generation

Introduction

This project is part of the Teknofest 2024 Türkçe Doğal Dil İşleme (Turkish Natural Language Processing) competition. Its aim is to generate a news title and news content from a given image.

Dataset

The dataset was collected from a news website. It consists of news titles, news content, and images, and all text is in Turkish.

Data-Preprocessing

Sample Data:

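  # English: "Historic building in Balıkesir reduced to ashes in a fire"
  title = "Balıkesir’de tarihi bina yangında küle döndü"
  word_index = {'Balıkesir’de': 9, 'tarihi': 5, 'bina': 3, 'yangında': 7, 'küle': 4, 'döndü': 6}
  tokens = [start_token, 9, 5, 3, 7, 4, 6, end_token]

Each word maps to a unique integer ID. Every training example pairs the image and a token prefix (input) with the next token to predict (output):

| Input | Output |
| --- | --- |
| Image + start_token | 9 |
| Image + start_token + 9 | 5 |
| Image + start_token + 9 + 5 | 3 |
| Image + start_token + 9 + 5 + 3 | 7 |
| Image + start_token + 9 + 5 + 3 + 7 | 4 |
| Image + start_token + 9 + 5 + 3 + 7 + 4 | 6 |
| Image + start_token + 9 + 5 + 3 + 7 + 4 + 6 | end_token |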
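The prefix-to-next-token expansion shown in the table can be sketched in plain Python. This is a minimal illustration, not the project's actual preprocessing code: the helper names (`build_word_index`, `encode`, `make_training_pairs`) and the `<start>`/`<end>` marker strings are assumptions, and the IDs it assigns (by order of first appearance) differ from the illustrative IDs in the sample above. The image, which accompanies every prefix, is left out for brevity.

```python
from typing import Dict, List, Tuple

# Hypothetical marker strings; the project may use different ones.
START_TOKEN = "<start>"
END_TOKEN = "<end>"


def build_word_index(titles: List[str]) -> Dict[str, int]:
    """Give every distinct word a unique integer ID (0 is left free for padding)."""
    word_index = {START_TOKEN: 1, END_TOKEN: 2}
    for title in titles:
        for word in title.split():
            if word not in word_index:
                word_index[word] = len(word_index) + 1
    return word_index


def encode(title: str, word_index: Dict[str, int]) -> List[int]:
    """Convert a title to token IDs, wrapped in start and end markers."""
    body = [word_index[word] for word in title.split()]
    return [word_index[START_TOKEN]] + body + [word_index[END_TOKEN]]


def make_training_pairs(tokens: List[int]) -> List[Tuple[List[int], int]]:
    """Expand one sequence into (prefix, next_token) pairs, one per table row."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


title = "Balıkesir’de tarihi bina yangında küle döndü"
word_index = build_word_index([title])
tokens = encode(title, word_index)
pairs = make_training_pairs(tokens)

for prefix, target in pairs:
    print(prefix, "->", target)
```

A six-word title yields seven pairs, matching the seven rows of the table: the last pair asks the model to predict the end marker from the full title.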

Model

The model combines a CNN encoder with an LSTM decoder. The image is fed to the encoder (CNN), and the CNN's output is fed to the decoder (LSTM) together with the input text.
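At generation time, an encoder-decoder of this kind is typically run one token at a time: start from the start token, ask the decoder for the next token, append it, and repeat until the end token appears. The README does not say which decoding strategy the project uses, so the sketch below assumes greedy decoding (always taking the highest-scoring token). `greedy_decode`, `toy_predict_next`, and the token IDs are hypothetical; `toy_predict_next` is only a stand-in for the trained CNN + LSTM model.

```python
from typing import Callable, List, Sequence

# A predictor takes (image_features, tokens_so_far) and returns one score per token ID.
Predictor = Callable[[Sequence[float], List[int]], Sequence[float]]


def greedy_decode(
    predict_next: Predictor,
    image_features: Sequence[float],
    start_id: int,
    end_id: int,
    max_len: int = 20,
) -> List[int]:
    """Generate token IDs one at a time, always picking the highest-scoring token."""
    tokens = [start_id]
    for _ in range(max_len):
        scores = predict_next(image_features, tokens)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        if next_id == end_id:
            break  # the model signalled the end of the title
        tokens.append(next_id)
    return tokens[1:]  # drop the start marker


def toy_predict_next(image_features: Sequence[float], tokens: List[int]) -> List[float]:
    """Stand-in for the trained model: emits IDs 3, 4, 5 and then the end marker (2)."""
    script = [3, 4, 5, 2]
    step = len(tokens) - 1
    target = script[min(step, len(script) - 1)]
    scores = [0.0] * 6
    scores[target] = 1.0
    return scores


result = greedy_decode(toy_predict_next, image_features=[0.1, 0.2], start_id=1, end_id=2)
print(result)
```

The `max_len` cap guards against a model that never emits the end token. Beam search or sampling could replace the greedy choice without changing the surrounding loop.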

VisionLang/YapayGazeteci-Teknofest2024-v2
