This repository has been archived by the owner on May 7, 2021. It is now read-only.

Missing acute in french in self harm2084 #2182

Closed
wants to merge 7 commits into from
2 changes: 1 addition & 1 deletion f2/src/locales/fr.json
@@ -14,7 +14,7 @@
"analystReport.reportLanguage": "Langue du rapport",
"analystReport.reportNumber": "Signaler le numéro",
"analystReport.reportVersion": "Version du rapport",
"analystReport.selfHarmString": "mots d'automutilation détectés",
Contributor: Where did this translation come from?

Contributor Author: It comes from the "Harm Word" title string, which is already defined and in use. The person who raised the issue wants this string to match the title.

"analystReport.selfHarmString": "Mots auto-nuisibles Trouvés:",
"analystReport.selfHarmWord": "MOTS AUTO-NUISIBLES TROUVÉS:",
"anonymousPage.intro": "Si vous choisissez de signaler de façon anonyme, nous ne vous demanderons pas vos coordonnées et nous ne pourrons pas faire de suivi.",
"anonymousPage.nextButton": "Continuer",
37 changes: 24 additions & 13 deletions f2/src/utils/selfHarmWordsScan.js
@@ -7,9 +7,8 @@ const { getLogger } = require('./winstonLogger')
const logger = getLogger(__filename)

const selfHarmString = process.env.SELF_HARM_WORDS || 'agilé, lean, mvp, scrum'
const selfHarmWords = selfHarmString
.split(',')
.map((w) => unidecode(w.trim().toLowerCase()))
const selfHarmWords = selfHarmString.split(',')

logger.info(`Self harm word list: ${selfHarmWords}`)

//Scan form data for self harm key words.
@@ -47,24 +46,36 @@ const selfHarmWordsScan = (data) => {
//Scan String for key words. Tokenize and stem to identify root words.
const scanString = (str) => {
try {
let modifiedStr = unidecode(str.toLowerCase())
let modifiedStr = str
Contributor: Why declare a variable that has the same value as the param?

Contributor Author: To keep the original copy of the data, since modifiedStr is heavily changed and regrouped in the code that follows.

Contributor Author: I can consolidate these two variables. Any other comments?
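
The exchange above turns on whether `scanString` needs a second variable. JavaScript strings are immutable, so `replace` returns a new string and never alters the parameter. A minimal sketch of the consolidated form, assuming a hypothetical helper named `cleanString` (not part of this PR) that reuses the three `replace` calls from the diff:

```javascript
// Hypothetical helper, not from the PR: cleans a string without keeping a copy first.
const cleanString = (str) =>
  str
    .replace(/\r?\n|\r/g, ' ') // Replace newline characters with spaces
    .replace(/[^\w\s']|_/g, ' ') // Replace special characters with spaces
    .replace(/\s+/g, ' ') // Collapse runs of whitespace into one space

// Made-up sample input for illustration.
const original = 'Bonjour,\nle   monde!'
const cleaned = cleanString(original)

console.log(JSON.stringify(cleaned)) // "Bonjour le monde "
console.log(original === 'Bonjour,\nle   monde!') // true: the input is untouched
```

Because each `replace` call produces a new string, the original value remains available through `str` for as long as it is needed.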


modifiedStr = modifiedStr
.replace(/\r?\n|\r/g, ' ') //Remove newline characters
.replace(/[^\w\s']|_/g, ' ') //Remove special characters
.replace(/\s+/g, ' ') //Remove any extra spaces

//Attempt to get root for words in String.
const formTokens = modifiedStr.tokenizeAndStem()
modifiedStr = modifiedStr + ', ' + formTokens.toString()

let wordsUsed = ''
let wordsUsedArray = []
let key_name_nl
normalizedModifiedStr = modifiedStr
.toLowerCase()
.normalize('NFD')
.replace(/[\u0300-\u036f]/g, '')
for (var key_nl in selfHarmWords) {
key_name_nl = selfHarmWords[key_nl]
.normalize('NFD')
.replace(/[\u0300-\u036f]/g, '')
.toLowerCase()
if (normalizedModifiedStr.includes(key_name_nl) && key_name_nl !== '') {
if (selfHarmWords[key_nl] !== '') {
wordsUsedArray.push(selfHarmWords[key_nl].toLowerCase())
}
}
}

//Create one String with both original and stemmed words.
modifiedStr = modifiedStr + ' ' + formTokens.toString().replace(/,/g, ' ')

//Compare text to the list of key words.
const wordsUsed = selfHarmWords.filter((w) => {
const regEx = new RegExp('\\b' + w + '\\b')
return regEx.test(modifiedStr)
})
wordsUsed = wordsUsedArray.toString()

return wordsUsed
} catch (err) {
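
The accent-insensitive matching added in this diff can be sketched in isolation as follows. This is an illustration only, not the PR's code: `stripAccents` and `findKeywords` are hypothetical names and the sample sentence is made up; the keyword list is the default value of `selfHarmString` shown in the diff.

```javascript
// Strip diacritics: decompose to NFD, drop combining marks (U+0300 to U+036F), lowercase.
const stripAccents = (s) =>
  s
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '')
    .toLowerCase()

// Hypothetical helper: returns the keywords found in `text`, ignoring accents and case.
const findKeywords = (text, keywords) => {
  const haystack = stripAccents(text)
  return keywords
    .map((w) => w.trim()) // Drop the space left after each comma by split(',')
    .filter((w) => w !== '' && haystack.includes(stripAccents(w)))
}

const keywords = 'agilé, lean, mvp, scrum'.split(',')
const found = findKeywords('Notre équipe est AGILE et utilise Scrum', keywords)

console.log(found) // [ 'agilé', 'scrum' ]
```

Two points are worth noting in this sketch. `includes` performs substring matching, so a keyword such as `lean` would also match inside `clean`; the `\b`-anchored `RegExp` that also appears in the diff avoids that. The sketch also normalizes the raw text directly, because `\w` in a JavaScript regular expression without the `u` flag matches only ASCII letters, digits, and underscore, so a cleanup step based on `[^\w\s']` would replace accented letters before they could be normalized.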