generated from entelecheia/course-template-i18n
Merge pull request #43 from entelecheia/main
Showing 33 changed files with 4,963 additions and 167 deletions.
@@ -0,0 +1,79 @@
# Week [n] Project Research Note

## Basic Information

- **Team Name**: [Enter team name]
- **Project Name**: [Enter project name]
- **Week**: Week [n]

## Team Member Activity Summary

| Name    | Role   | Key Activities                   | Next Week's Plan         |
| ------- | ------ | -------------------------------- | ------------------------ |
| [Name1] | [Role] | • [Activity1] <br> • [Activity2] | • [Plan1] <br> • [Plan2] |
| [Name2] | [Role] | • [Activity1] <br> • [Activity2] | • [Plan1] <br> • [Plan2] |
| [Name3] | [Role] | • [Activity1] <br> • [Activity2] | • [Plan1] <br> • [Plan2] |
| [Name4] | [Role] | • [Activity1] <br> • [Activity2] | • [Plan1] <br> • [Plan2] |

## Weekly Goal Achievement

| Goal    | Status                          | Notes                    |
| ------- | ------------------------------- | ------------------------ |
| [Goal1] | [Completed/In Progress/Delayed] | [Additional explanation] |
| [Goal2] | [Completed/In Progress/Delayed] | [Additional explanation] |
| [Goal3] | [Completed/In Progress/Delayed] | [Additional explanation] |

## Key Achievements and Deliverables

1. [Description of key achievement or deliverable 1]
2. [Description of key achievement or deliverable 2]
3. [Description of key achievement or deliverable 3]

## Technical Challenges and Solutions

1. **Challenge 1**: [Description of the challenge]
   - Solution: [Description of the solution]
2. **Challenge 2**: [Description of the challenge]
   - Solution: [Description of the solution]

## Learning Outcomes

1. [Learning topic 1]
   - Key points: [Brief explanation]
   - Application: [How it can be applied to the project]
2. [Learning topic 2]
   - Key points: [Brief explanation]
   - Application: [How it can be applied to the project]

## Next Week's Plan

1. [Plan 1]
2. [Plan 2]
3. [Plan 3]

## Other Notable Items

- [Notable item 1]
- [Notable item 2]

## Team Meeting Summary

- **Date and Time**: YYYY-MM-DD HH:MM
- **Attendees**: [List of attendees]
- **Key Discussion Points**:
  1. [Discussion point 1]
  2. [Discussion point 2]
  3. [Discussion point 3]
- **Decisions Made**:
  1. [Decision 1]
  2. [Decision 2]

## Attachments

1. [Description and link to attachment 1]
2. [Description and link to attachment 2]

---

Date of Entry: YYYY-MM-DD
Logged by: [Name of logger]
@@ -0,0 +1,66 @@
# Week 2 - Basics of Text Preprocessing

## Overview

This week, we'll dive into the fundamental techniques of text preprocessing, a crucial step in any Natural Language Processing (NLP) pipeline. Text preprocessing is essential for cleaning and standardizing raw text data, making it suitable for further analysis and model training.

## Learning Objectives

By the end of this week, you will be able to:

1. Understand the importance of text preprocessing in NLP tasks
2. Implement and apply various tokenization techniques
3. Perform text normalization, including case normalization and punctuation removal
4. Identify and remove stop words from text data
5. Use the NLTK (Natural Language Toolkit) library for text preprocessing tasks

## Key Topics

### 1. Tokenization

- Definition and importance of tokenization
- Word tokenization vs. sentence tokenization
- Challenges in tokenization (e.g., contractions, hyphenated words)
- Different tokenization approaches (rule-based, statistical, neural)
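
As a rough illustration of the rule-based approach, the sketch below splits text into sentences and words with two regular expressions. The patterns and the function names `word_tokenize` and `sentence_tokenize` are assumptions made for this example, not part of the course materials or of any library; they keep contractions and hyphenated words together, and they would mis-split abbreviations such as "Dr." that a statistical or neural tokenizer is designed to handle.

```python
import re

# Hypothetical rule-based patterns, chosen for illustration only.
# A word may contain internal apostrophes or hyphens (don't, state-of-the-art);
# any other non-space character becomes its own token.
WORD_PATTERN = re.compile(r"\w+(?:['’-]\w+)*|[^\w\s]")
# A sentence boundary is whitespace that directly follows '.', '!' or '?'.
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def word_tokenize(text: str) -> list[str]:
    """Split text into word and punctuation tokens."""
    return WORD_PATTERN.findall(text)


def sentence_tokenize(text: str) -> list[str]:
    """Split text into sentences at terminal punctuation."""
    return [s for s in SENTENCE_BOUNDARY.split(text.strip()) if s]


sample = "Don't panic. State-of-the-art models aren't magic!"
print(sentence_tokenize(sample))
print(word_tokenize(sample))
```

Sentence tokenization returns two strings here, while word tokenization returns eight tokens, with `Don't` and `State-of-the-art` each kept as a single token.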

### 2. Normalization

- Case normalization (lowercasing/uppercasing)
- Punctuation removal
- Handling special characters and numbers
- Spelling correction and text canonicalization
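
The normalization steps listed above might be chained as in the following sketch, which uses only the Python standard library. The function name `normalize`, the `NUM` placeholder, and the order of the steps are assumptions for illustration; spelling correction is left out because it needs a dictionary or a model.

```python
import re
import string
import unicodedata

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)
# Integers and simple decimals such as 42 or 3.5.
_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def normalize(text: str, number_token: str = "NUM") -> str:
    """Apply a fixed sequence of normalization steps to raw text."""
    # Canonicalize Unicode so compatibility characters (ligatures,
    # full-width forms) map to their plain equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Case normalization.
    text = text.lower()
    # Replace numbers before stripping punctuation, so "3.5" is treated
    # as one number instead of collapsing to "35".
    text = _NUMBER.sub(number_token, text)
    # Punctuation removal (ASCII punctuation only).
    text = text.translate(_PUNCT_TABLE)
    # Collapse runs of whitespace left behind by the earlier steps.
    return " ".join(text.split())


print(normalize("The PRICE rose 3.5%, to $42!!  Really?"))
```

The order matters: lowercasing after inserting the placeholder would turn `NUM` into `num`, and removing punctuation first would change the numbers themselves.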

### 3. Stop Word Removal

- Definition and purpose of stop words
- Common stop words in English
- Impact of stop word removal on NLP tasks
- Considerations for domain-specific stop words
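
A minimal sketch of stop word removal follows. The ten-word stop list and the names `BASIC_STOP_WORDS`, `clinical_stop_words`, and `sentiment_stop_words` are hypothetical and far smaller than the lists shipped with real libraries; the point is to show how the choice of list changes what survives.

```python
# A tiny, hand-picked stop list for illustration; real lists are much longer.
BASIC_STOP_WORDS = frozenset(
    {"a", "an", "the", "is", "are", "of", "in", "to", "and", "not"}
)


def remove_stop_words(tokens, stop_words=BASIC_STOP_WORDS):
    """Drop tokens whose lowercase form appears in the stop list."""
    return [t for t in tokens if t.lower() not in stop_words]


tokens = ["The", "patient", "is", "not", "in", "pain"]

# General-purpose list: the negation disappears along with the filler words.
print(remove_stop_words(tokens))

# Domain-specific list: in clinical notes, "patient" carries little signal.
clinical_stop_words = BASIC_STOP_WORDS | {"patient"}
print(remove_stop_words(tokens, clinical_stop_words))

# Task-specific list: keep "not" when negation matters, as in sentiment analysis.
sentiment_stop_words = BASIC_STOP_WORDS - {"not"}
print(remove_stop_words(tokens, sentiment_stop_words))
```

With the basic list, "not in pain" is reduced to "pain", which reverses the meaning of the sentence. This is one reason stop word removal can hurt tasks that depend on negation.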

### 4. NLTK Library for Text Preprocessing

- Introduction to NLTK
- Using NLTK for tokenization
- NLTK's built-in stop word lists
- Additional NLTK preprocessing utilities
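
The sketch below shows one way to try NLTK tokenization without any setup beyond installing the package. It uses `WordPunctTokenizer` from `nltk.tokenize`, a regular-expression tokenizer that needs no downloaded data; other NLTK features, including its stop word lists, rely on data packages fetched separately with `nltk.download`. The wrapper name `wordpunct_tokenize_safe` and the fallback are assumptions for this example: if NLTK is not installed, the code uses a regular expression intended to mirror that tokenizer's rule.

```python
import re

try:
    from nltk.tokenize import WordPunctTokenizer
except ImportError:  # NLTK is not installed in this environment
    WordPunctTokenizer = None


def wordpunct_tokenize_safe(text: str) -> list[str]:
    """Tokenize into alphanumeric runs and punctuation runs."""
    if WordPunctTokenizer is not None:
        return WordPunctTokenizer().tokenize(text)
    # Fallback: runs of word characters, or runs of non-space punctuation.
    return re.findall(r"\w+|[^\w\s]+", text)


print(wordpunct_tokenize_safe("Don't stop."))
```

This tokenizer splits the contraction `Don't` into three tokens (`Don`, `'`, `t`), which illustrates the contraction problem listed under Tokenization and why the choice of tokenizer is worth comparing in the practical session.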

## Practical Component

In this week's practical session, you will:

- Install and set up the NLTK library
- Implement a text preprocessing pipeline using NLTK
- Experiment with different tokenization methods
- Compare the effects of various preprocessing steps on sample texts
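
To compare the effect of each preprocessing step, it can help to record the tokens after every stage. The following is a minimal standard-library sketch under stated assumptions: whitespace tokenization, ASCII punctuation stripping, and a five-word illustrative stop list stand in for the NLTK components used in the session, and the names `run_pipeline` and `STOP_WORDS` are hypothetical.

```python
import string

STOP_WORDS = frozenset({"the", "a", "is", "on", "and"})  # illustrative subset
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def tokenize(text):
    """Whitespace tokenization: the simplest possible baseline."""
    return text.split()


def normalize(tokens):
    """Lowercase each token, strip ASCII punctuation, drop empty results."""
    cleaned = (t.lower().translate(_PUNCT_TABLE) for t in tokens)
    return [t for t in cleaned if t]


def remove_stop_words(tokens):
    """Filter out tokens found in the stop list."""
    return [t for t in tokens if t not in STOP_WORDS]


def run_pipeline(text):
    """Return the tokens after each stage so the stages can be compared."""
    stages = {}
    stages["tokenized"] = tokenize(text)
    stages["normalized"] = normalize(stages["tokenized"])
    stages["stop_words_removed"] = remove_stop_words(stages["normalized"])
    return stages


for name, tokens in run_pipeline("The cat sat on the mat, and the dog barked!").items():
    print(f"{name:<20} {len(tokens):>2} tokens: {tokens}")
```

Keeping each stage as a separate function makes it straightforward to swap in a different tokenizer or stop list and rerun the comparison.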

## Assignment

You will be given a dataset of raw text and tasked with creating a comprehensive preprocessing pipeline. Your solution should include tokenization, normalization, and stop word removal. You'll also need to provide a brief report discussing the impact of each preprocessing step on the resulting text.

## Looking Ahead

The text preprocessing skills you learn this week will form the foundation for more advanced NLP tasks we'll explore in the coming weeks. Next week, we'll build upon these basics to delve into the fundamentals of language models.

```{tableofcontents}
```