First Text Analysis Python Project was my first project involving unstructured text analysis.
My open-source contributions span several programming languages. The tools and languages used for each contribution are listed at the end of its description.
While working on a stock screening application using Pandas, I encountered issues that only appeared with specific data values. Initially, I implemented a workaround in my own code; it functioned correctly but was cumbersome and difficult to maintain. It also did nothing for other existing code that might behave inconsistently depending on the data. To fix the issue at its source, I contributed changes to Pandas itself as well as to pandas-datareader, which is now maintained as a standalone project.
My contributions included:
- Improvements to missing value handling, unit tests, and documentation.
- A signature-preserving decorator for compatibility with Python 2.
- API-level changes, including updates to documentation and “What’s New” notes.
- A temporary workaround for a known issue, along with active participation in related discussions.
Tools and languages: Python, Cython, pytest, Sphinx, reStructuredText.
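To illustrate what a signature-preserving decorator has to solve: on Python 2, `functools.wraps` copies a function's name and docstring, but introspection still reports the wrapper's own `(*args, **kwargs)` parameters. One common remedy is to generate the wrapper from source text so that it declares the same parameter names as the original. The sketch below shows that idea in Python 3 syntax, for plain positional parameters only. The names `log_calls` and `_call` are hypothetical, and this is not the code that was contributed to Pandas.

```python
import functools
import inspect


def log_calls(func):
    """Record every call to ``func`` while keeping its parameter list."""
    spec = inspect.getfullargspec(func)
    # Sketch limitation: no *args, **kwargs or keyword-only parameters.
    if spec.varargs or spec.varkw or spec.kwonlyargs:
        raise TypeError("this sketch handles plain positional parameters only")

    # Generate a wrapper that declares exactly the original parameter names.
    params = ", ".join(spec.args)
    source = "def wrapper({0}):\n    return _call({0})\n".format(params)

    calls = []

    def _call(*args):
        calls.append(args)
        return func(*args)

    namespace = {"_call": _call}
    exec(source, namespace)

    # Copy name and docstring, then the default values, onto the new wrapper.
    wrapper = functools.update_wrapper(namespace["wrapper"], func)
    wrapper.__defaults__ = func.__defaults__
    wrapper.calls = calls
    return wrapper


@log_calls
def scale(values, factor=2):
    """Multiply every value by ``factor``."""
    return [v * factor for v in values]
```

Here `inspect.getfullargspec(scale)` looks at the generated wrapper itself, without following `__wrapped__`, and still reports `values` and `factor`, so tools that read the signature directly see the original parameters.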
This program projects when, over time, hedge fund investors will get their invested capital back. Most of the calculations are performed using Pandas.
The program reads data from an Excel file containing at least two worksheets: Liquidity Terms and Tranche Investments.
The program includes three scripts that generate reports and visualizations based on the data.
The emphasis is on the most common hedge fund withdrawal restrictions.
A more detailed description is in the HedgeFundsRedemption.md file.
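As a rough illustration of this kind of projection, the sketch below combines three common restrictions: a lockup period, a notice period, and quarter-end redemption dates. Everything specific in it is an assumption made for the example: the column names, the sample figures, the function name `project_redemptions`, and the simplification that each tranche is redeemed in full at the first permitted quarter-end. The actual program and its worksheet layout may differ.

```python
import pandas as pd

# Hypothetical stand-ins for the two worksheets; real column names may differ.
# With a workbook on disk, pd.read_excel(path, sheet_name=[...]) returns a
# dict of DataFrames keyed by sheet name.
liquidity_terms = pd.DataFrame({
    "fund": ["Alpha", "Beta"],
    "lockup_months": [12, 6],
    "notice_days": [90, 30],
})
tranche_investments = pd.DataFrame({
    "fund": ["Alpha", "Alpha", "Beta"],
    "invested_on": pd.to_datetime(["2023-01-15", "2024-03-01", "2022-06-30"]),
    "amount": [1_000_000, 500_000, 250_000],
})


def project_redemptions(tranches, terms, as_of):
    """Sum the amounts becoming redeemable at each quarter-end (assumed rules)."""
    as_of = pd.Timestamp(as_of)
    merged = tranches.merge(terms, on="fund", how="left")

    # A tranche cannot be redeemed before its lockup expires.
    lockup_end = pd.Series(
        [d + pd.DateOffset(months=int(m))
         for d, m in zip(merged["invested_on"], merged["lockup_months"])],
        index=merged.index,
    )
    # Notice given on the as-of date takes effect after the notice period.
    notice_ready = as_of + pd.to_timedelta(merged["notice_days"], unit="D")

    # Both restrictions must be satisfied, so take the later date per tranche.
    earliest = lockup_end.where(lockup_end >= notice_ready, notice_ready)

    # Redemptions are assumed to happen only on calendar quarter-ends.
    quarter_end = pd.offsets.QuarterEnd()
    merged["redemption_date"] = [quarter_end.rollforward(d) for d in earliest]
    return merged.groupby("redemption_date")["amount"].sum()


flows = project_redemptions(tranche_investments, liquidity_terms, "2024-06-01")
```

Taking the later of the lockup expiry and the notice date before rolling forward to a quarter-end keeps each restriction independent, which makes it easier to add further terms later.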
This is a fork of jckantor's Python dateutil rule sets for NYSE trading days and holiday observances. The original rules are valid from the present onward. However, for backtesting or pattern recognition, there is often a need to access NYSE trading days from the past several years. The rules have been modified to provide NYSE trading days and holiday observances starting from 1986.
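The sketch below shows how trading days can be expressed as dateutil rules: every weekday in a range, minus the holiday observances. It is an illustration rather than the fork's actual code. The helper names `fixed_holiday` and `trading_days` are hypothetical, only two fixed-date holidays are covered, and the weekend rule assumed here (Saturday holidays observed on Friday, Sunday holidays on Monday) ignores the exchange's special cases and historical exceptions.

```python
import datetime as dt

from dateutil import rrule

WEEKDAYS = (rrule.MO, rrule.TU, rrule.WE, rrule.TH, rrule.FR)


def fixed_holiday(month, day, start, end):
    """Observance dates of a fixed-date holiday between start and end.

    Assumes the day is not the first or last of its month.
    """
    rules = rrule.rruleset()
    common = dict(dtstart=start, until=end, bymonth=month)
    # The holiday itself, when it falls on a weekday.
    rules.rrule(rrule.rrule(rrule.YEARLY, bymonthday=day,
                            byweekday=WEEKDAYS, **common))
    # A Saturday holiday is observed on the preceding Friday.
    rules.rrule(rrule.rrule(rrule.YEARLY, bymonthday=day - 1,
                            byweekday=rrule.FR, **common))
    # A Sunday holiday is observed on the following Monday.
    rules.rrule(rrule.rrule(rrule.YEARLY, bymonthday=day + 1,
                            byweekday=rrule.MO, **common))
    return rules


def trading_days(start, end, holidays):
    """Weekdays between start and end, excluding the given holiday dates."""
    days = rrule.rruleset()
    days.rrule(rrule.rrule(rrule.DAILY, dtstart=start, until=end,
                           byweekday=WEEKDAYS))
    for holiday in holidays:
        days.exdate(holiday)
    return list(days)


start, end = dt.datetime(2015, 1, 1), dt.datetime(2016, 12, 31)
holidays = (list(fixed_holiday(7, 4, start, end))
            + list(fixed_holiday(12, 25, start, end)))
week = trading_days(dt.datetime(2015, 6, 29), dt.datetime(2015, 7, 7), holidays)
```

In 2015, July 4 fell on a Saturday, so the rule yields Friday, July 3 as the observance and that date is missing from `week`.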
This Java program performs basic operations on datasets stored in CSV (comma-separated values) files. It first reads the dataset into a dataframe and then runs the requested operations on that dataframe.
The program can be used as a library or directly from the command line. Users can define operations using a simple language when running from the command line.
The main purpose of this project is to illustrate that in Java, the absence of a comprehensive library like Pandas makes advanced data processing quite time-consuming. In many cases, you may find it more efficient to use Python and Pandas, even if it requires learning a new language.
That said, if you are a Java developer who doesn't know Python and only needs to perform relatively simple column-based dataset operations, this tool could be a practical option.
For more details, please refer to the project’s README file.