CEPOL Research & Science Conference 2022 MRU, Vilnius

Georgios Lygeros

I am 40 years old and I was born in Patras in Western Greece. I'm a Police Officer and I'm currently in charge of the Regional Department that deals with trafficking in human beings for sexual exploitation. At the same time, through my role as Regional Coordinator for internet-based crime, I work in close collaboration with the Central Division of Cyber Crime, focusing on cases committed through the internet. Finally, I am also a certified member of the Crisis Negotiation Team of the Hellenic Police, responsible for the region of Western Greece. I have a master's degree in communication and information systems from the University of the Aegean and a master's degree in crisis management from the Kapodistrian University of Athens.


Sessions

06-09
09:00
20min
Identification of invalid information about the Covid-19 coronavirus pandemic on a social networking platform
Georgios Lygeros

From the first moment the COVID-19 coronavirus pandemic began to spread, another pandemic broke out alongside it: that of misinformation. Social media plays a key role in this, because it allows any information to be transmitted rapidly around the world. The consequences of transmitting invalid information can be worse than the consequences of the pandemic itself. Conspiracy theories and false therapies are just two of the common categories that have unintended consequences for public health. At the same time, the volume of information circulated on a daily basis makes checking the reliability of information a particularly demanding challenge for law enforcement in the Digital Age.
My thesis studies the automatic detection of fake news about the evolving coronavirus pandemic on social networks, specifically on Twitter. For this purpose, natural language processing (NLP) and Machine Learning algorithms are used. The data used to train the algorithms originates from a publicly accessible dataset that contains tweets related to the current pandemic. From the dataset, only the Greek-language content was isolated. These tweets were classified and annotated in three categories: true, irrelevant, or false. Once a sufficient amount of data has been annotated, the most common words in each category are visualized through word clouds. In addition, a set of linguistic and morphological features is extracted from the tweets by applying methods that convert texts into vectors, along with features related to the subjectivity of the tweets' texts. Additional features are calculated using the TF-IDF method and are used in conjunction with the morphological features. Python libraries such as NLTK, spaCy and Scikit-Learn are used to calculate these features. Before these features are fed into the learning algorithms, PCA is applied to reduce their dimensions. Three learning algorithms are trained, Random Forest, SVM and Multinomial Naive Bayes, of which Random Forest has the most encouraging results. Our results show that it is possible to automatically detect invalid information in posts on Twitter despite the peculiarities that characterize the Greek language.
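The pipeline described in the abstract (TF-IDF features, PCA, then a Random Forest classifier) might look roughly like the following minimal sketch in Scikit-Learn. It is not the thesis implementation: the example texts and labels are invented English placeholders rather than the annotated Greek tweets, the morphological and subjectivity features are left out, and every parameter value (number of components, number of trees) is an assumption chosen only so the example runs.

```python
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

# Hypothetical toy data, invented for illustration only.
# The thesis uses annotated Greek-language tweets instead.
texts = [
    "health ministry announces new vaccination centres",
    "official report confirms daily case numbers",
    "drinking hot water cures the virus instantly",
    "the pandemic is a hoax created by mobile networks",
    "great football match last night",
    "recipe for a quick summer salad",
]
labels = ["true", "true", "false", "false", "irrelevant", "irrelevant"]


def to_dense(matrix):
    """Convert the sparse TF-IDF matrix to a dense array before PCA."""
    return matrix.toarray()


pipeline = make_pipeline(
    TfidfVectorizer(),                       # text -> TF-IDF vectors
    FunctionTransformer(to_dense),           # sparse -> dense
    PCA(n_components=3),                     # reduce dimensionality (assumed value)
    RandomForestClassifier(n_estimators=50, random_state=0),
)

pipeline.fit(texts, labels)
predictions = pipeline.predict(["new study published by the health ministry"])
print(predictions)
```

Only Random Forest is shown here. Multinomial Naive Bayes expects non-negative inputs, so it could not consume PCA output directly in a pipeline of this shape; how the thesis handles that is not stated in the abstract.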

• Challenges of Artificial Intelligence for policing and law enforcement in the Digital Age
Lecture Room 2 - I-416