Building a training set of tags for r #297
Comments
Exercise: hamming
Code
# This is a stub function to take two strings
# and calculate the hamming distance
hamming <- function(strand1, strand2) {
  if (nchar(strand1) == nchar(strand2)) {
    count <- 0
    s1 <- strsplit(strand1, "")[[1]]
    s2 <- strsplit(strand2, "")[[1]]
    # seq_along() gives an empty sequence for empty strands,
    # whereas seq(length(s1)) would count down from 1 to 0
    for (i in seq_along(s1)) {
      if (s1[i] != s2[i]) {
        count <- count + 1
      }
    }
    return(count)
  } else {
    return(NULL)
  }
}
Tags:
|
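Exercise: hamming
Code
# This is a stub function to take two strings
# and calculate the hamming distance
hamming <- function(strand1, strand2) {
  # length() of a single string is 1, so split each strand into characters first
  s1 <- strsplit(strand1, "")[[1]]
  s2 <- strsplit(strand2, "")[[1]]
  hamm <- 0
  # R vectors are 1-indexed, so iterate over 1..length rather than 0:size
  for (i in seq_along(s1)) {
    if (s1[i] != s2[i]) {
      hamm <- hamm + 1
    }
  }
  return(hamm)
}
Tags: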
|
Exercise: raindrops
Code
raindrops <- function(x = 0) {
  # Coerce to integer so whole numbers stored as doubles (e.g. 28) work too;
  # the previous is.integer() check left `factors` undefined for such input
  x <- as.integer(x)
  # Extract factors
  div <- seq_len(abs(x))
  factors <- div[x %% div == 0L]
  if (any(factors %in% c(3, 5, 7))) {
    # Substitute numbers for words
    raindrop <- ifelse(factors == 3, "Pling",
                ifelse(factors == 5, "Plang",
                ifelse(factors == 7, "Plong", NA)))
    raindrop <- raindrop[!is.na(raindrop)]
    raindrop <- paste(raindrop, collapse = "")
    return(raindrop)
  # if no keyword, return original value
  } else {return(x)}
}
Tags:
|
Exercise: grains
Code
square <- function(n) {
  sqr <- vector("numeric", length = n)
  sqr[1] <- 1
  # seq_len(n)[-1] is empty when n == 1, whereas seq(2, n) would count down 2, 1
  for (i in seq_len(n)[-1]) {
    sqr[i] <- sqr[i - 1] * 2
  }
  sqr[n]
}
total <- function() {
  sqr <- vector("numeric", length = 64)
  sqr[1] <- 1
  for (i in seq(2, 64)) {
    sqr[i] <- sqr[i - 1] * 2
  }
  sum(sqr)
}
Tags:
|
Exercise: anagram
Code
anagram <- function(subject, candidates) {
}
Tags:
|
Exercise: space-age
Code
space_age <- function(seconds, planet) {
  # Length of one Earth year in days
  earth_orbit <- 365.25
  # Orbital periods in days, as multiples of an Earth year
  orbital_data <- list(mercury = earth_orbit * 0.2408467,
                       earth = earth_orbit,
                       venus = earth_orbit * 0.61519726,
                       mars = earth_orbit * 1.8808158,
                       jupiter = earth_orbit * 11.862615,
                       saturn = earth_orbit * 29.447498,
                       uranus = earth_orbit * 84.016846,
                       neptune = earth_orbit * 164.79132)
  ans <- seconds / (orbital_data[[planet]] * 24 * 3600)
  return(round(ans, 2))
}
Tags:
|
Exercise: sum-of-multiples
Code
sum_of_multiples <- function(factors, limit) {
}
Tags:
|
Exercise: word-count
Code
word_count <- function(words) {
  as.list(table(strsplit(words, " ")[[1]]))
}
Tags:
|
Exercise: phone-number
Code
# str_match_all() comes from the stringr package
library(stringr)
parse_phone_number <- function(number_string) {
  digits <- as.vector(str_match_all(number_string, "\\d")[[1]])
  n <- length(digits)
  if (n < 10 || n > 11) {
    NULL
  }
  else if (n == 11 && digits[[1]] != 1) { # country code != 1
    NULL
  }
  else {
    area_code <- digits[[n - 9]]
    exchange_code <- digits[[n - 6]]
    if (area_code < 2 || exchange_code < 2) {
      NULL
    }
    else {
      paste(digits[(n - 10 + 1):n], sep = "", collapse = "")
    }
  }
}
Tags:
|
Exercise: isogram
Code
is_isogram <- function(word) {
}
Tags: No tags generated |
Exercise: beer-song
Code
lyrics <- function(first, last) {
paste(sapply(seq(first, last, by = -1), verse), collapse = "\n")
}
verse <- function(number) {
if (number > 2) {
text <- c(paste0(number, " bottles of beer on the wall, ", number, " bottles of beer."),
paste0("Take one down and pass it around, ", number - 1, " bottles of beer on the wall.\n"))
} else if (number == 2) {
text <- c(paste0(number, " bottles of beer on the wall, ", number, " bottles of beer."),
paste0("Take one down and pass it around, ", number - 1, " bottle of beer on the wall.\n"))
} else if (number == 1) {
text <- c(paste0(number, " bottle of beer on the wall, ", number, " bottle of beer."),
paste0("Take it down and pass it around, no more bottles of beer on the wall.\n"))
} else {
text <- c("No more bottles of beer on the wall, no more bottles of beer.",
"Go to the store and buy some more, 99 bottles of beer on the wall.\n")
}
return(paste(text, collapse = "\n"))
}
Tags:
|
Exercise: perfect-numbers
Code
library(magrittr)
is_perfect <- function(n){
# catch edge cases
if (n <= 0)
stop("Only natural number can be classified here!")
if (n <= 2)
return("deficient")
# find n's factors, incl. 1 but not n (aliquots)
factor <- function(i)
if (n %% i == 0)
i
# calculate sum and classify n
lapply(1:(n/2), factor) %>%
unlist %>%
sum ->
sum
dplyr::case_when(
sum == n ~ "perfect",
sum < n ~ "deficient",
sum > n ~ "abundant"
)
}
Tags:
|
Exercise: prime-factors
Code
prime_factors <- function(number) {
out <- c()
if(number > 1) {
for(n in c(2:number)){
while((number / n) %% 1 == 0) {
out <- c(out, n)
number <- number /n}
if(n == number) {break}
}
out
} else {out}
}
Tags:
|
Exercise: largest-series-product
Code
largest_series_product <- function(digits, span){
}
Tags: No tags generated |
Exercise: pascals-triangle
Code
pascals_triangle <- function(n) {
}
Tags:
|
Exercise: pascals-triangle
Code
pascalsTriangle <- function(n) {
if (n == 0) {
return(list())
}
else if (n == 1) {
return(list(1))
}
else if (n == 2) {
return(list(1, c(1,1)))
}
else if (n >= 3) {
triangle <- list(1, c(1,1))
for (x in 3:n) {
row <- rep(1, x)
for (i in 2:(x - 1)) {
row[i] = sum(triangle[[x - 1]][(i - 1):i])
}
triangle[[x]] = row
}
return(triangle)
}
else {
stop("argument n needs to be an integer")
}
}
Tags:
|
Exercise: nucleotide-count
Code
nucleotide_count <- function(input) {
}
Tags:
|
Exercise: pangram
Code
is_pangram <- function(input) {
}
Tags:
|
Tag editing underway... |
This is an automated comment Hello 👋 Next week we're going to start using the tagging work people are doing on these. If you've already completed the work, thank you! If you've not, but intend to this week, that's great! If you're not going to get round to doing it, and you've not yet posted a comment letting us know, could you please do so, so that we can find other people to do it. Thanks! |
I did a run through and light touch edit on tags but they're likely not very comprehensive and I'm sure there are still inconsistencies so would appreciate it if anyone else is able to check and improve these tags. A number of these were either test files or empty solution stubs, so I wasn't sure whether we want to tag those since they could be used but obviously aren't representative training data. |
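For anyone wanting to separate the empty stubs from real solutions before tagging, here is a minimal sketch in base R. The helper name `is_empty_stub` is hypothetical and not part of any Exercism tooling; it only assumes that a stub consists solely of top-level `name <- function(...) { }` definitions with nothing in the body.

```r
# Hypothetical helper: TRUE when every top-level expression in `code`
# is a function definition whose body is an empty `{ }` block.
is_empty_stub <- function(code) {
  exprs <- as.list(parse(text = code, keep.source = FALSE))
  if (length(exprs) == 0) return(TRUE)
  all(vapply(exprs, function(e) {
    # Expect `name <- <something>`
    is_assign <- is.call(e) && identical(e[[1]], as.name("<-")) && length(e) == 3
    if (!is_assign) return(FALSE)
    rhs <- e[[3]]
    # Expect the right-hand side to be a `function(...)` expression
    if (!(is.call(rhs) && identical(rhs[[1]], as.name("function")))) return(FALSE)
    body <- rhs[[3]]
    # An empty body parses to a call to `{` with no statements
    is.call(body) && identical(body[[1]], as.name("{")) && length(body) == 1
  }, logical(1)))
}

is_empty_stub("anagram <- function(subject, candidates) {\n}")
is_empty_stub("square <- function(n) {\n  2^(n - 1)\n}")
```

Because the parser discards comments, a stub whose body holds only a comment is also treated as empty.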
Thanks for the help! We've updated the tags. |
Hello lovely maintainers 👋
We've recently added "tags" to students' solutions. These express the constructs, paradigms and techniques that a solution uses. We are going to be using these tags for lots of things including filtering, pointing a student to alternative approaches, and much more.
In order to do this, we've built out a full AST-based tagger in C#, which has allowed us to do things like detect recursion or bit shifting. We've set things up so other tracks can do the same for their languages, but it's a lot of work, and we've determined that it may actually be unnecessary. Instead, we think that we can use machine learning to achieve tagging with good enough results. We've fine-tuned a model that can determine the correct tags for C# from the examples with a high success rate. It's also doing reasonably well in an untrained state for other languages. We think that with only a few examples per language, we can potentially get some quite good results, and that we can then refine things further as we go.
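To give a feel for what AST-based tagging means on this track, here is a minimal sketch in base R. It is not the C# tagger described above: the function name `suggest_tags` and the tag strings in `tag_map` are illustrative assumptions, and the sketch only looks for a handful of constructs by name in the parse tree.

```r
# Illustrative mapping from names found in the parse tree to tag strings.
# These tag names are assumptions for the sketch, not the official list.
tag_map <- c(
  "for"    = "construct:for-loop",
  "while"  = "construct:while-loop",
  "if"     = "construct:if",
  "sapply" = "technique:higher-order-functions",
  "lapply" = "technique:higher-order-functions"
)

suggest_tags <- function(code) {
  exprs <- as.list(parse(text = code, keep.source = FALSE))
  # all.names() walks an expression and returns every symbol in it,
  # including the names of called functions such as `for` and `if`.
  names_used <- unique(unlist(lapply(exprs, all.names)))
  tags <- unname(tag_map[intersect(names(tag_map), names_used)])
  sort(unique(tags))
}

solution <- '
hamming <- function(s1, s2) {
  count <- 0
  for (i in seq_along(s1)) {
    if (s1[i] != s2[i]) count <- count + 1
  }
  count
}'
suggest_tags(solution)
```

Name matching like this cannot tell a variable called `lapply` from a call to it, which is one reason a full tagger needs more than a symbol scan.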
I released a new video on the Insiders page that talks through this in more detail.
We're going to be adding a fully-fledged UI in the coming weeks that allows maintainers and mentors to tag solutions and create training sets for the neural networks, but to start with, we're hoping you would be willing to manually tag 20 solutions for this track. In this post we'll add 20 comments, each with a student's solution, and the tags our model has generated. Your mission (should you choose to accept it) is to edit the tags on each issue, removing any incorrect ones, and adding any that are missing. In order to build one model that performs well across languages, it's best if you stick as closely as possible to the C# tags. Those are listed here. If you want to add extra tags, that's totally fine, but please don't arbitrarily reword existing tags, even if you don't like what Erik's chosen, as it'll just make it less likely that your language gets the correct tags assigned by the neural network.
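As a rough illustration of the editing step, the sketch below compares the tags the model generated with the tags left after a maintainer's review. The function name `review_tags` and the example tag strings are hypothetical and only serve to show the remove/add/keep split.

```r
# Hypothetical helper: summarise how a reviewed tag set differs
# from the model-generated one.
review_tags <- function(generated, corrected) {
  list(
    removed = setdiff(generated, corrected),   # incorrect tags taken out
    added   = setdiff(corrected, generated),   # missing tags put in
    kept    = intersect(generated, corrected)  # tags the model got right
  )
}

generated <- c("construct:for-loop", "construct:if", "technique:recursion")
corrected <- c("construct:for-loop", "construct:if", "construct:return")
review_tags(generated, corrected)
```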
To summarise - there are two paths forward for this issue:
If you tell us you're not able/wanting to help or there's no comment added, we'll automatically crowd-source this in a week or so.
Finally, if you have questions or want to discuss things, it would be best done on the forum, so the knowledge can be shared across all maintainers in all tracks.
Thanks for your help! 💙
Note: Meta discussion on the forum