General Problem Solving

General problem solving (GPS) involves solving every kind of problem in a satisfactory manner. It proceeds through a series of steps known as the “problem-solving cycle,” which are applied in order until a satisfactory solution is found. How acceptable the solution is depends on personal judgment. Five of the most common processes and factors that researchers have identified…

Generating Unique Short Hashes

Ever wonder how URL shortening websites get such short hashes? Most of them simply generate a random hash and then check whether they’ve used it before. An advantage of this system is that very short hashes can be created with no possibility of collision, especially if a wide range of characters is used. While…
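The generate-then-check approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `new_short_hash`, the 62-character alphabet, the default length of 6, and the in-memory `set` standing in for a database of used codes are all hypothetical choices, not details from the article.

```python
import secrets
import string

# Assumed alphabet: upper- and lower-case letters plus digits (62 characters).
ALPHABET = string.ascii_letters + string.digits


def new_short_hash(used, length=6):
    """Draw random codes until one is found that is not already in `used`."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in used:
            # Record the code so it can never be handed out twice.
            used.add(candidate)
            return candidate


# Hypothetical stand-in for the site's table of already-issued codes.
used_codes = set()
codes = [new_short_hash(used_codes) for _ in range(1000)]
```

Because every candidate is checked against the set of issued codes before it is returned, two callers can never receive the same code. A wider alphabet or a longer code simply makes retries rarer.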

How not to save user passwords

On March 21, 2019, Facebook announced that it had exposed hundreds of millions of its users’ passwords. A bug in its password management systems caused passwords for Facebook, Facebook Lite, and Instagram to be stored as plaintext in an internal platform. As a result, thousands of Facebook employees could potentially have seen them. Krebs reports…

Tips on cleaning English text data for analysis

Here’s some advice on how to clean natural text for data analysis. These suggestions are meant for English. These are in order of how useful I think they are, not the order that you should apply them. For example, you would need to do safe reduction before deleting stop words. 1. Keep copies Keep a…

Why does PHP’s password_hash() output change each time the same password is hashed?

Nota bene: the hash() algorithm in this article has been slightly altered so that the code below doesn’t work. This is intentional: this code should not be used for secure hashing as it is merely a demonstration of why the same password can generate a different hash. The hash for a password should change each…
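As a rough illustration of why the same password can produce different hashes, here is a minimal Python sketch that mixes a fresh random salt into each hash. It is not the article's code and not PHP's password_hash() algorithm: the helper names `hash_password` and `verify_password`, the PBKDF2-SHA256 scheme, the 16-byte salt, and the iteration count are assumptions made for this example.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest); a new random salt is drawn unless one is given."""
    if salt is None:
        salt = os.urandom(16)  # fresh salt, so each call yields a different digest
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


# Hashing the same password twice gives two different salts and digests.
salt_a, hash_a = hash_password("correct horse")
salt_b, hash_b = hash_password("correct horse")
```

Verification still works because the salt is stored alongside the digest: re-hashing the submitted password with the stored salt reproduces the stored digest exactly.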

Decision Trees for Linguists Pt. 2 – Information Gain

Entropy and Information Gain. Continued from Part 1. Let’s start with some definitions. Tree: A hierarchical structure of nodes and connections between those nodes (branches) with parent-child relationships. Child nodes have parent nodes, which in turn may have their own parent nodes. The highest node is the root node. Decision tree: a flow-chart-like structure where…