| Abstract |
This paper proposes an Artificial Intelligence (AI) text detection and classification model that combines Bidirectional Encoder Representations from Transformers (BERT) with a Text Convolutional Neural Network (TextCNN) and is applicable to smart city social media monitoring. By combining BERT's contextual semantic understanding with TextCNN's local feature extraction, the model achieves efficient text detection and classification. The model first uses BERT to produce a semantic representation of the input text, capturing rich contextual features. These features are then fed into TextCNN, where multiple convolution and pooling operations extract and compress them. Finally, a fully connected layer converts the extracted features into fixed-length vectors for classification. Experimental results show that the hybrid model significantly outperforms traditional baseline models on two public datasets, Human ChatGPT Comparison Corpus (HC3) and sharegpt_gpt4, with notable improvements in Accuracy (ACC), Precision (PREC), Recall (REC), and F1-score (F1). For example, on the sharegpt_gpt4 dataset, accuracy reaches 0.8490, a significant improvement over the baseline model. These results support the effectiveness of combining BERT and TextCNN for text classification tasks. The contribution of this paper is the integration of BERT's contextual semantic understanding with TextCNN's local feature extraction, which improves text classification performance. Evaluation on two distinct datasets, HC3 and sharegpt_gpt4, indicates that the model is robust across different text types.
Additionally, the model's application to smart city social media monitoring demonstrates its practical relevance: accurate and efficient text detection is crucial for monitoring public sentiment and emerging issues in real time. © 2024 IEEE. |
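
The pipeline summarized in the abstract (encoder output, then convolution and pooling, then a fully connected layer) can be illustrated with a minimal pure-Python sketch. This is not the paper's implementation. It assumes that random vectors stand in for BERT's per-token output, that the kernel widths (2, 3, 4), two filters per width, and two output classes are illustrative choices, and that max-over-time pooling is used. All function names (`conv1d_relu`, `textcnn_head`, `make_toy_parameters`, `fake_encoder_output`) are hypothetical. A real system would likely use a deep learning framework and a pretrained BERT checkpoint.

```python
import math
import random


def conv1d_relu(seq, kernel, bias):
    """Slide one kernel (width x dim) over the token sequence, then apply ReLU."""
    width = len(kernel)
    activations = []
    for start in range(len(seq) - width + 1):
        total = bias
        for k in range(width):
            total += sum(w * x for w, x in zip(kernel[k], seq[start + k]))
        activations.append(max(0.0, total))
    return activations


def textcnn_head(seq, kernels, biases, fc_weights, fc_bias):
    """Convolve, max-pool over time, then classify with a fully connected layer.

    The sequence must be at least as long as the widest kernel.
    """
    # Max-over-time pooling keeps one value per kernel, so the pooled
    # vector has the same length for any input sequence length.
    pooled = [max(conv1d_relu(seq, k, b)) for k, b in zip(kernels, biases)]
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(fc_weights, fc_bias)]
    # Numerically stable softmax over the class logits.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    norm = sum(exps)
    return pooled, [e / norm for e in exps]


def make_toy_parameters(dim, widths=(2, 3, 4), filters_per_width=2,
                        num_classes=2, seed=0):
    """Random, untrained weights (assumption: illustrative sizes only)."""
    rng = random.Random(seed)
    kernels = [[[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(w)]
               for w in widths for _ in range(filters_per_width)]
    biases = [0.0] * len(kernels)
    fc_weights = [[rng.uniform(-0.5, 0.5) for _ in range(len(kernels))]
                  for _ in range(num_classes)]
    fc_bias = [0.0] * num_classes
    return kernels, biases, fc_weights, fc_bias


def fake_encoder_output(num_tokens, dim, seed):
    """Stand-in for BERT's per-token hidden states (assumption: random values)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(num_tokens)]


DIM = 8  # toy embedding size; BERT-base would produce 768-dimensional vectors
params = make_toy_parameters(DIM)
short_text = fake_encoder_output(5, DIM, seed=1)
long_text = fake_encoder_output(12, DIM, seed=2)

pooled_short, probs_short = textcnn_head(short_text, *params)
pooled_long, probs_long = textcnn_head(long_text, *params)

print(len(pooled_short), len(pooled_long))  # 6 6
print(probs_short, probs_long)
```

Both inputs yield a six-element pooled vector (three kernel widths times two filters) despite their different lengths, which is what lets the fully connected layer operate on variable-length text. The weights here are untrained, so the printed class probabilities carry no meaning.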