Storing the lookup map on disk as a JSON-encoded dictionary seems suboptimal for package size and module load time. Two plaintext files (M.txt and F.txt) would be simpler and more compact on disk. The text is also highly compressible, which could further reduce package size. These things might matter if the package is used in a serverless environment.
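A minimal sketch of the two-file layout suggested above, assuming one name per line and gzip compression. The `.gz` file names and the helpers `write_names`, `load_names`, and `build_lookup` are hypothetical and not part of the package:

```python
import gzip
from pathlib import Path


def write_names(path, names):
    """Write one name per line to a gzip-compressed text file."""
    with gzip.open(path, "wt", encoding="utf-8") as fh:
        for name in sorted(names):
            fh.write(name + "\n")


def load_names(path):
    """Read a gzip-compressed, one-name-per-line file into a set."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def build_lookup(data_dir):
    """Build a name -> label dict from M.txt.gz and F.txt.gz.

    Assumption: each name appears in exactly one of the two files.
    """
    data_dir = Path(data_dir)
    lookup = {name: "M" for name in load_names(data_dir / "M.txt.gz")}
    lookup.update({name: "F" for name in load_names(data_dir / "F.txt.gz")})
    return lookup
```

Whether this actually beats JSON on load time would need measuring. Any third label would also need its own file or some other encoding.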
Also, do you think there could be value in identifying classically androgynous names?
Thanks for sharing your feedback! Great idea on using .txt instead; I'll make that change. (This is my first time sharing a package I've prepared on GitHub, so I'm a noob with that kind of stuff.)
There are names in the current JSON file classified as "N", which stands for non-binary, but their frequency is quite low. A name is labeled "N" if its "M" and "F" frequencies are equal or within a certain magnitude of each other (the magnitude calculation is based on proportions testing). That said, maybe it'd be worth adding functionality for a user to upload their own gender_lookup file?
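For illustration, here is one way such a rule could look. This is a sketch only: it assumes a one-sample z-test of the male share against 0.5 with a 1.96 cutoff, and the function name `classify` and its parameters are hypothetical. The package's actual test and threshold may differ.

```python
import math


def classify(male_count, female_count, z_threshold=1.96):
    """Return "M", "F", or "N" for one name from its counts.

    Assumption: a one-sample z-test of the male share against 0.5;
    the real package may use a different proportions test.
    """
    total = male_count + female_count
    if total == 0:
        raise ValueError("name has no observations")
    p_male = male_count / total
    # Standard error of a proportion under the null hypothesis p = 0.5.
    std_err = math.sqrt(0.25 / total)
    z = (p_male - 0.5) / std_err
    if abs(z) < z_threshold:
        return "N"  # the two shares are statistically indistinguishable
    return "M" if z > 0 else "F"
```

With this rule, a name seen 52 times as male and 48 times as female would come out as "N", while a 900 to 100 split would come out as "M".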
nic-waller|5 years ago
parthmaul|5 years ago
jk801|5 years ago
parthmaul|5 years ago