• Türkçe
    • English
  • English 
    • Türkçe
    • English
  • Login
View Item 
  •   DSpace Home
  • Mühendislik Fakültesi
  • Bilgisayar Mühendisliği Bölümü
  • Bilgisayar Mühendisliği Bölümü Tez Koleksiyonu
  • View Item
  •   DSpace Home
  • Mühendislik Fakültesi
  • Bilgisayar Mühendisliği Bölümü
  • Bilgisayar Mühendisliği Bölümü Tez Koleksiyonu
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

A Plagiarism Detection System Based on POS Tag N-Grams

View/Open
Thesis File (2.031Mb)
Date
2022
Author
Yalçın, Kadir
xmlui.dri2xhtml.METS-1.0.item-emb
Acik erisim
xmlui.mirage2.itemSummaryView.MetaData
Show full item record
Abstract
It is a common problem to find similar parts in two different documents or texts. Especially, a text suspected of plagiarism is likely to have similar characteristics with the source text. Plagiarism is defined as taking some or all of the writings of other people and showing them as their own, or expressing the ideas of others in different ways without citing the source. Today, it is observed that there is an increase in plagiarism cases with the development of technology. Therefore, in order to prevent plagiarism, various plagiarism detection programs have been used in universities and principles regarding plagiarism and scientific ethics have been added to education regulations. In this thesis, a novel method for detecting external plagiarism is proposed. Both syntactic and semantic similarity features were used to identify the plagiarized parts of the text. Part-of-speech (POS) tags are used to identify the plagiarized sections of suspicious texts and the original sections corresponding to these sections in the source texts. Each source sentence is indexed by a search engine according to its POS tag n-grams to access possible plagiarism candidate sentences rapidly. Suspicious sentences that converted to their POS tag n-grams are used as query to access source sentences. The search engine results returned from the queries enable to detect plagiarized parts of the suspicious document. The semantic relationship between two given words is calculated with Word2Vec, which is a method for using word embeddings. On the other hand, the longest common subsequence (LCS) algorithm is applied to calculate semantic similarity at the sentence level. In this thesis, PAN-PC-11 dataset, which was created to evaluate automated plagiarism detection algorithms, is used. The tests are carried out with different parameters and threshold values to evaluate the diversity of the results. According to the experimental results with this dataset, the proposed method achieved the best performance in low and high obfuscation plagiarism cases compared to the plagiarism detection systems in the 3rd International Plagiarism Detection Competition (PAN11).
URI
http://hdl.handle.net/11655/26924
xmlui.mirage2.itemSummaryView.Collections
  • Bilgisayar Mühendisliği Bölümü Tez Koleksiyonu [177]
xmlui.dri2xhtml.METS-1.0.item-citation
K. Yalcin, "A Plagiarism Detection System Based on POS Tag N-Grams, Doctoral Dissertation", Hacettepe University, Ankara, 2022.
Hacettepe Üniversitesi Kütüphaneleri
Açık Erişim Birimi
Beytepe Kütüphanesi | Tel: (90 - 312) 297 6585-117 || Sağlık Bilimleri Kütüphanesi | Tel: (90 - 312) 305 1067
Bizi Takip Edebilirsiniz: Facebook | Twitter | Youtube | Instagram
Web sayfası:www.library.hacettepe.edu.tr | E-posta:openaccess@hacettepe.edu.tr
Sayfanın çıktısını almak için lütfen tıklayınız.
Contact Us | Send Feedback



DSpace software copyright © 2002-2016  DuraSpace
Theme by 
Atmire NV
 

 


DSpace@Hacettepe
huk openaire onayı
by OpenAIRE

About HUAES
Open Access PolicyGuidesSubcriptionsContact

livechat

sherpa/romeo

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsTypeDepartmentPublisherLanguageRightsxmlui.ArtifactBrowser.Navigation.browse_indexFundingxmlui.ArtifactBrowser.Navigation.browse_subtypeThis CollectionBy Issue DateAuthorsTitlesSubjectsTypeDepartmentPublisherLanguageRightsxmlui.ArtifactBrowser.Navigation.browse_indexFundingxmlui.ArtifactBrowser.Navigation.browse_subtype

My Account

LoginRegister

Statistics

View Usage Statistics

DSpace software copyright © 2002-2016  DuraSpace
Theme by 
Atmire NV