Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.

Pages

Paper Title Number 4

Published in GitHub Journal of Bugs, 2024

This paper is about fixing template issue #693.

Recommended citation: Your Name, You. (2024). "Paper Title Number 4." GitHub Journal of Bugs. 1(3).
Download Paper

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

publications

talks

CFE-CM Statistics Conference

Published:

It is well known in the literature that the main limitations of document clustering techniques are that they operate in a high-dimensional space and that the clusters are difficult to interpret once a partition has been obtained. The proposed method for document clustering employs a two-stage process. First, the information contained in the Document-Term matrix is highly sparse, so a direct application of a clustering technique would be highly inefficient. Consequently, dimensionality reduction is applied: Latent Dirichlet Allocation (LDA) is used to identify the main topics in the corpus under analysis. Second, to determine the similarity between two documents, the p-value of a hypothesis test of the homogeneity of their topic distributions is computed. This p-value is used as a similarity measure, upon which three different clustering procedures are built. The first two directly employ the resulting dissimilarity, one using a hierarchical approach and the other a fuzzy relational clustering approach, while the third is a test-based approach to clustering. The performance of the clustering methods is then assessed on benchmark datasets in order to understand the advantages and disadvantages of the proposals.
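
The second stage described in the abstract can be illustrated with a minimal sketch. It is not the talk's implementation: the abstract does not name the hypothesis test, so a chi-square test of homogeneity is assumed here, and the dissimilarity is taken to be one minus the p-value. The LDA step is skipped, and `topic_counts` is a made-up table of how many words in each document were assigned to each topic. All function names are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Chi-square survival function P(X > x), computed from the power
    series of the regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, h = df / 2.0, x / 2.0
    term = total = 1.0 / a
    n = 0
    while abs(term) > 1e-15 * total and n < 10_000:
        n += 1
        term *= h / (a + n)
        total += term
    log_cdf = a * math.log(h) - h - math.lgamma(a) + math.log(total)
    return min(1.0, max(0.0, 1.0 - math.exp(log_cdf)))


def homogeneity_pvalue(c1, c2):
    """p-value of a chi-square homogeneity test between two documents,
    each given as per-topic word counts (assumed test, see lead-in)."""
    n1, n2 = sum(c1), sum(c2)
    if n1 == 0 or n2 == 0:
        return 1.0  # assumption: an empty document carries no evidence
    stat, used = 0.0, 0
    for a, b in zip(c1, c2):
        col = a + b
        if col == 0:
            continue  # a topic absent from both documents adds nothing
        used += 1
        for obs, row in ((a, n1), (b, n2)):
            expected = row * col / (n1 + n2)
            stat += (obs - expected) ** 2 / expected
    return chi2_sf(stat, used - 1) if used > 1 else 1.0


def dissimilarity_matrix(counts):
    """Pairwise dissimilarity, assumed here to be 1 - p-value."""
    n = len(counts)
    return [[0.0 if i == j else 1.0 - homogeneity_pvalue(counts[i], counts[j])
             for j in range(n)] for i in range(n)]


def average_linkage(dist, k):
    """Plain agglomerative clustering with average linkage, stopping
    once k clusters remain."""
    clusters = [[i] for i in range(len(dist))]

    def link(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > k:
        _, i, j = min((link(clusters[i], clusters[j]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        merged = clusters.pop(j)  # j > i, so index i is unaffected
        clusters[i].extend(merged)
    return sorted(sorted(c) for c in clusters)


# Made-up per-topic word counts for four documents and three topics.
topic_counts = [[40, 5, 5], [38, 6, 6], [4, 42, 4], [6, 40, 4]]
dist = dissimilarity_matrix(topic_counts)
clusters = average_linkage(dist, k=2)
print(clusters)
```

Documents whose topic profiles are statistically indistinguishable get a p-value near 1 and therefore a dissimilarity near 0, so they merge early in the hierarchy. The fuzzy relational and test-based procedures mentioned in the abstract are not covered by this sketch.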

Violence Against Women (VAW) workshop 2024

Published:

I had the privilege of assisting in the organization of this workshop, which addresses a truly important topic. This event provided an opportunity to explore the issue from various perspectives, including social, psychological, and legal viewpoints, as well as from a modeling standpoint, with the aim of studying the over/under-representation of the phenomenon within official statistics.

teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.