어떤 내용의 논문인가요? 👋
Style transfer for text currently lacks a clear evaluation method, which creates a bottleneck for research in the area.
The authors set out to find metrics that can clearly evaluate three aspects: style transfer intensity (how effectively the style changed), content preservation (whether the meaning stayed the same), and naturalness.
To identify the best evaluation methods, they run experiments on a Yelp sentiment dataset.
As a result, they propose the following three evaluation methods, which correlate strongly with human evaluation:
direction-corrected Earth Mover's Distance
Word Mover's Distance on style-masked texts
adversarial classification for the respective aspects
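To make the first metric concrete, here is a minimal sketch of a direction-corrected Earth Mover's Distance. It assumes a style classifier that outputs class-probability distributions for the source and transferred texts; the function name and inputs are hypothetical, not the paper's exact implementation.

```python
from scipy.stats import wasserstein_distance

def direction_corrected_emd(p_src, p_out, target_class):
    """EMD between the style distributions of the source and output texts.

    p_src, p_out: class-probability distributions from a style classifier
                  (lists summing to 1, one entry per style class).
    target_class: index of the style the transfer was supposed to reach.
    The distance is zeroed out when probability mass moved *away* from
    the target style, so only transfers in the right direction score.
    """
    classes = list(range(len(p_src)))
    emd = wasserstein_distance(classes, classes,
                               u_weights=p_src, v_weights=p_out)
    moved_toward_target = p_out[target_class] >= p_src[target_class]
    return emd if moved_toward_target else 0.0

# Transfer from negative (class 0) toward positive (class 1):
print(direction_corrected_emd([0.9, 0.1], [0.2, 0.8], target_class=1))  # 0.7
# Same magnitude of change, but in the wrong direction:
print(direction_corrected_emd([0.2, 0.8], [0.9, 0.1], target_class=1))  # 0.0
```

For a binary style task this reduces to the absolute change in the target-class probability, clipped to zero for wrong-direction shifts; the EMD formulation generalizes the same idea to more than two style classes.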
Abstract 🕵🏻♂️
Research in the area of style transfer for text is currently bottlenecked by a lack of standard evaluation practices. This paper aims to alleviate this issue by experimentally identifying best practices with a Yelp sentiment dataset. We specify three aspects of interest (style transfer intensity, content preservation, and naturalness) and show how to obtain more reliable measures of them from human evaluation than in previous work. We propose a set of metrics for automated evaluation and demonstrate that they are more strongly correlated and in agreement with human judgment: direction-corrected Earth Mover's Distance, Word Mover's Distance on style-masked texts, and adversarial classification for the respective aspects. We also show that the three examined models exhibit tradeoffs between aspects of interest, demonstrating the importance of evaluating style transfer models at specific points of their tradeoff plots. We release software with our evaluation metrics to facilitate research.
이 논문을 읽어서 무엇을 배울 수 있는지 알려주세요! (What can you learn from this paper?) 🤔
Learn about evaluation methods for judging how well a style transfer worked.
Be able to report clearer metrics when writing a style transfer paper.
Gain insight into finding metrics that correlate strongly with human evaluation.
Reference URL 🔗
https://arxiv.org/abs/1904.02295