COLLECTION AND PREPARATION OF CRIMINAL CONTENT DATA FROM WEB SOURCES

Authors

  • D.A. Abdramanov, M. Auezov South Kazakhstan University
  • J.D. Iztaev, M. Auezov South Kazakhstan University
  • С.Ж. Құракбаева, M. Auezov South Kazakhstan University
  • I.K. Baynazarova, M. Auezov South Kazakhstan University

DOI:

https://doi.org/10.54251/2616-6429.2025.01.11nu

Keywords:

web content, Scikit-learn, NLTK, TensorFlow, Python, Jupyter Notebook, BeautifulSoup (BS4), XML, HTML, machine learning

Abstract

Criminal texts, such as those that plan crimes, incite unlawful acts, or spread false information, pose a threat to security in the online environment. Detecting and classifying such texts is becoming an integral part of combating cybercrime. As the volume of publicly available information grows and illegal activity on the Internet increases, effective methods for the automatic detection and classification of criminal texts are needed.

One approach to classifying criminal texts is morphological analysis, which examines the structure of words, their grammatical forms, and their lexical and syntactic features. However, criminal texts have distinct characteristics, so existing morphological analysis methods are not always effective for classifying them. Existing methods therefore need to be modified and improved to increase accuracy and produce more reliable results.

Author Biographies

  • D.A. Abdramanov, M. Auezov South Kazakhstan University

    Master's student

  • J.D. Iztaev, M. Auezov South Kazakhstan University

    Candidate of Pedagogical Sciences

  • С.Ж. Құракбаева, M. Auezov South Kazakhstan University

    Candidate of Technical Sciences, Professor

  • I.K. Baynazarova, M. Auezov South Kazakhstan University

    Master, Senior Lecturer

Published

2025-03-05

Section

COMPUTER SCIENCE, INFORMATION TECHNOLOGIES

How to Cite

COLLECTION AND PREPARATION OF CRIMINAL CONTENT DATA FROM WEB SOURCES. (2025). SOUTH KAZAKHSTAN SCIENCE HERALD, 1, 65-73. https://doi.org/10.54251/2616-6429.2025.01.11nu