MULTILINGUAL NLP ANALYSIS OF THE TERM BANLIEUE

Project description
This project is a socio-linguistic analysis of the term "banlieue" across multiple languages. It combines a set of scripts to collect, process, and analyze data, with the aim of building a comparable multilingual corpus for studying how banlieue ("suburb" in English) is used in different socio-linguistic contexts. The corpus covers French, Italian, American English, and Modern Greek.
In the data collection phase, shell scripts automate the retrieval of relevant articles from Google News using keywords like banlieue and police in each language. The scripts process a list of URLs, fetch the content of each page, and store it in a structured manner. Each URL is checked for its HTTP status code and encoding type. The content is then extracted, converted to plain text, and searched for occurrences of the term banlieue and its equivalents in the other languages.
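The encoding check and the occurrence count described above can be illustrated with a minimal sketch. The project itself uses shell scripts, so this Python version is only an illustration under stated assumptions: the function names `charset_from_content_type` and `count_term`, the UTF-8 fallback, and the plural-tolerant pattern are all hypothetical and do not come from the project's code.

```python
import re


def charset_from_content_type(header, default="utf-8"):
    """Extract the charset from a Content-Type header value.

    Example input: 'text/html; charset=ISO-8859-1'.
    Falls back to `default` (an assumption) when no charset is declared.
    """
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)', header, flags=re.IGNORECASE)
    return match.group(1).lower() if match else default


def count_term(text, pattern):
    """Count case-insensitive matches of a regular expression in text."""
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


# Hypothetical pattern: whole word "banlieue" with an optional plural "s".
BANLIEUE = r"\bbanlieues?\b"

sample = "La banlieue et les Banlieues. Un banlieusard."
print(charset_from_content_type("text/html; charset=ISO-8859-1"))
print(count_term(sample, BANLIEUE))
```

A separate pattern would be needed for each language's equivalent term. Word boundaries (`\b`) keep derived words such as "banlieusard" out of the count.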
Once the data is collected, the scripts organize it into HTML tables, making it easier to view and compare. These tables include details such as the HTTP code, encoding, titles, and URLs of the articles, as well as the number of occurrences of the target term and its context within the text. Additional scripts generate visual representations, like word clouds, to highlight the most common terms associated with banlieue in each language. For deeper analysis, a concordance table is created, displaying the context in which the key terms appear.
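As an illustration of the concordance table, here is a minimal Python sketch of a keyword-in-context extraction rendered as an HTML table. It is not the project's shell implementation: the function names `concordance` and `to_html_table`, the context width, and the column headers are assumptions made for the example.

```python
import html
import re


def concordance(text, pattern, width=30):
    """Return (left context, match, right context) for each match of pattern.

    `width` is the number of characters kept on each side (an assumed default).
    """
    rows = []
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()].strip()
        right = text[m.end():m.end() + width].strip()
        rows.append((left, m.group(0), right))
    return rows


def to_html_table(rows):
    """Render concordance rows as a simple three-column HTML table."""
    lines = [
        "<table>",
        "<tr><th>Left context</th><th>Term</th><th>Right context</th></tr>",
    ]
    for row in rows:
        # Escape each cell so characters like < or & cannot break the markup.
        cells = "".join(f"<td>{html.escape(cell)}</td>" for cell in row)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


text = "La police patrouille en banlieue chaque soir."
rows = concordance(text, r"\bbanlieues?\b", width=10)
print(to_html_table(rows))
```

Cutting the context by character count keeps the sketch short but can split words at the edges. Cutting by a number of tokens would be a reasonable alternative.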
The project also involves developing a website to present the results. The website includes pages detailing the project steps, the scripts used, the data tables, and the analytical results. HTML and CSS are used to build the site structure, while Bootstrap handles design and responsiveness. Integrating the articles and analysis results into the site makes the findings easy to browse. Visit the project site HERE to see the results of the analysis.