Pinbenterjamin's comments

Pinbenterjamin | 5 years ago | on: Ask HN: What skills should a new grad be learning?

Best advice I can give, as a manager, is self-discipline. It's a core tenet of what divides my good workers from my standard workers.

Sure, learning Java, or C#, or Python is nice. But what really makes an effective worker to me is someone who has strategies for tackling problems, and I believe that the highest abstraction of that is self-discipline.

When you are tasked with a problem in the real world, whether it be enterprise, start-up, or side project, you will need to look at the problem wholly. You will need strategies for laying a path that you, and you alone, can follow.

There's so much to writing software, but every developer wants to type. Do you have a good method for discovery? Understanding unfamiliar project landscapes, or code flows? Are you practicing reading code? If a manager gave you a task in a project you've never seen before, how do you know where to start?

Do you have a strategy for note taking? For flow charting as you discover? For documenting minutiae as you come across it during development? If you do, are you consistent? You need to be diligent and habitual in your process. If you present problems in a consistent manner to your manager, you will develop an easy means of communication, allowing you to solve things faster, and get what you need faster.

Are you consistent in your coding style? Are you implementing according to the problem, or according to whatever new tool you most recently discovered? That's engineering discipline.

I didn't really get there until I was in my late 20s, but it took some seriously trivial changes in my life to get started on that road. Like, I started making my bed every morning. Reading before bed, making a small breakfast, and coffee before work. Walking twice a day. Gym at lunch.

Having a schedule, being habitual, helps you get in the mindset of optimization. Optimizing your own processes will lead you to consistency, and sharpening your consistency will force you to be disciplined!

Pinbenterjamin | 6 years ago | on: Pentagon Requests $15B for Space Force

I'm going to try and be optimistic about this;

$10B in research and development? I can't imagine that it's going to be for 'weapons' only, right? Some of this has to go into the future of humankind. I would imagine they have to be interested in colonization, as it's generally in the interest of humankind to expand to the reaches of outer space, and to do so and be confronted with opposition would require such an agency.

I'd also like to know what the overlap in responsibility between NASA and Space Force will be. Are they both scientific ventures? Is Space Force actually interested in intergalactic safety?

I'm cautiously optimistic.

Pinbenterjamin | 6 years ago | on: New developers, a piece of advice. Learn a text editor

It is nice to have tools that don't require 30 seconds to boot up, just so you can view some XML (looking at you, Visual Studio). But the recommendation of vi or emacs... just why? I scratch my head when people recommend 30-year-old tech.

Those who use vi and emacs to their full potential are that generation of developers who grew up with those tools. Nowadays, why wouldn't you use something like VSCode as your main text editor? Atom? Or even Notepad++? Unless you are confined by your job to an SSH terminal or something. If you want to LEARN a text editor you might as well pick one of the modern ones with all the nice modern QOL updates. The tech-flex of developing in a terminal is so dumb.

And vi? Why not Vim? Recommending vi is like recommending ANSI C over C18.

Pinbenterjamin | 6 years ago | on: Ask HN: Do I need a formal business plan for my side project?

You definitely should. I remember when I started doing serious side projects, how invaluable having a document was.

It should be living though. Start with something small, describe goals, and outlook. Put a feature checklist on it. As you work through early problems and refactors, treat it like a development blog. Include build notes.

You will probably put this down, and pick it up a few times on your journey. Make sure you leave plenty of breadcrumbs for yourself so that you can be productive each time you do.

And, should you decide to put it down for good, you'll have a nice little project history for lessons the next time you try it.

Pinbenterjamin | 6 years ago | on: The 2010s were supposed to bring the eBook revolution. It never quite came

It's all in what you want as a reader.

For some, it's comparing 'holding a physical book and turning the pages' to the more active, note-taking readers.

I don't think the leaders in the ebook industry really understood what the target market was when they took on the endeavor.

For me, ebooks made a lot more sense. Using exclusively the Kindle app, I can keep track of my progress across multiple books, score and rate them, make notes (which are preserved and easy to access), and share insightful quotes from my favorite books. If you're a social reader there's really nothing like it.

If you're more of the page turner type, or someone who enjoys the catharsis of reading a book, nothing is going to compare to the feeling of turning a physical page, sipping on some tea in just the right amount of lighting, etc.

I think ebooks set out to capture the first market, and they did it, but people expected it to dominate the book market, which it hasn't and won't.

Pinbenterjamin | 6 years ago | on: Ask HN: How do you learn new things?

There are impossibly diverse options for learning today.

Ideally, you'll want to find which methodology (or better, methodologies) works best for you, and continue to branch from there.

The idea of 'how do you learn' extends above answers like 'books' or 'youtube' or 'coursera', and fits better in the categorization of 'I like to Listen', 'I like to try', 'I like to watch'.

Ideally, find which of these, or which combination of these is most interesting to you, and then find the appropriate tool to leverage that.

Personally, I like to Read, and I like to Do.

I have a brainstorming meeting with my Team once every 2 months where we spitball home project ideas, and have a show and tell from work we've done in our spare time. Once I've settled on an idea, I apply the above concepts to learn as much as I can about it.

For example, the last project was to build and solve a Rubik's Cube. I read about the mathematics of a Rubik's Cube, and built models alongside. I know that having reference material, and an instance of CLion open next to me is my recipe for successful learning.

Pinbenterjamin | 6 years ago | on: Ask HN: What's a promising area to work on?

Process Automation / Business Rule Optimization is a nearly untouched field for large, existing software driven companies.

For goal N, where N is an amalgam of small processes, what can we do to isolate, automate, and optimize the process?

Everyone is hyper-focused on generic ML solutions, but it has been a hell of a lot easier to hand-code solutions to problems in existing mid-to-large scale businesses that are looking to alleviate a process pain point. If you have the brains to determine some basic ROI ballparks, you can make back engineering hours fast if you can either increase the output of your business, or decrease the required people to accomplish that output.

There are some fields that touch upon this, like DevOps, but this is a more strict definition of 'operations', in that I'm referring to the non-technical people.

The impact of this practice is amplified by two major factors: 1. Quantity of work 2. Difficulty of work

If you have a high quantity of N, small optimizations are compounded across all the work centers. If you have highly difficult work, the time it takes to 'handle' or 'create' output can be reduced.
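To put rough numbers on the 'ROI ballpark' idea, here's a minimal sketch. Every figure and function name in it is made up for illustration, and it only compares hours to hours (it ignores the fact that an engineering hour and an operations hour rarely cost the same).

```python
def weekly_hours_saved(minutes_saved_per_item, items_per_week):
    """Hours of manual work removed each week by an optimization."""
    return minutes_saved_per_item * items_per_week / 60


def payback_weeks(engineering_hours, minutes_saved_per_item, items_per_week):
    """Weeks until the saved hours equal the engineering hours spent."""
    saved = weekly_hours_saved(minutes_saved_per_item, items_per_week)
    if saved <= 0:
        raise ValueError("the change has to save time to pay anything back")
    return engineering_hours / saved


# Hypothetical high-quantity case: shave 2 minutes off 3000 items a week.
high_volume = payback_weeks(200, 2, 3000)

# Hypothetical high-difficulty case: shave 45 minutes off 40 hard items a week.
hard_work = payback_weeks(200, 45, 40)

print(f"high volume pays back in {high_volume:.1f} weeks")
print(f"hard work pays back in {hard_work:.1f} weeks")
```

With these invented inputs the high-volume change saves 100 hours a week, so 200 engineering hours come back in 2 weeks. The difficult-work change saves 30 hours a week and takes a little under 7 weeks.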

This idea ties in to a fantastic book, 'The Goal' by Eliyahu M. Goldratt.

Being in 'hot tech' is cool, but mastering old concepts is valuable to businesses rigid to change. As the number of established companies that leverage tech for a majority of their work grows, so does the need for brains that are capable of process improvement with a scalpel, rather than a chainsaw.

Pinbenterjamin | 6 years ago | on: Predictably Random

Kind of unrelated, but I recently tested out a scenario for the .NET environment that worked really well:

I created a 'Random' service that lives for the length of the execution of the application.

This service has an instance of Random that persists with the object, and exposes simple methods with min/max parameters.

I register the service in a Unity container, and then immediately resolve it, causing the Random type inside of the random service to instantiate.

Then anywhere I want to generate a random number, I inject that service.

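This works because, as long as you persist a single instance of 'Random', successive calls to 'Next' or 'NextDouble' draw from the same sequence. You avoid the classic problem where several 'Random' instances created in quick succession share a seed and hand back identical numbers.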
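For anyone who wants to see the shape of it, here's a rough Python analogue, not the actual .NET code. `RandomService`, `Container`, and `DiceRoller` are hypothetical names, and the container is a toy stand-in for a real DI container rather than Unity's API.

```python
import random


class RandomService:
    """Owns one long-lived generator for the whole application (hypothetical)."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def next_int(self, min_value, max_value):
        """Return an integer in [min_value, max_value)."""
        return self._rng.randrange(min_value, max_value)

    def next_double(self):
        """Return a float in [0.0, 1.0)."""
        return self._rng.random()


class Container:
    """Toy singleton container; an assumption, not a real library."""

    def __init__(self):
        self._factories = {}
        self._instances = {}

    def register_singleton(self, key, factory):
        self._factories[key] = factory

    def resolve(self, key):
        # Build the instance on first resolve, then always return the same one.
        if key not in self._instances:
            self._instances[key] = self._factories[key]()
        return self._instances[key]


class DiceRoller:
    """Example consumer that receives the shared service by injection."""

    def __init__(self, rng):
        self._rng = rng

    def roll(self):
        return self._rng.next_int(1, 7)


container = Container()
container.register_singleton(RandomService, RandomService)
container.resolve(RandomService)  # resolve immediately so the generator exists

roller_a = DiceRoller(container.resolve(RandomService))
roller_b = DiceRoller(container.resolve(RandomService))
print(roller_a.roll(), roller_b.roll())
```

Both rollers pull from the same generator, so neither one restarts the sequence. If multiple threads share the service, access to the generator would need its own consideration.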

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

That's the general direction I'd like to take. When we capture the inputs for the scrapers, I'd like to persist everything. Mouse jiggles, delays, idle time. I think it would definitely help advance the software.

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

We have an enormous legal team that communicates constantly with end points to ensure they are aware of our scraping. And as I said in another comment, we store no results other than what is already available to anyone else using the web.

We've had this division for many, many years, and before my time we paid another company to do this. There are no legal issues.

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

According to the NDA with my company I can't reveal anything about the architecture beyond the fact that it is hosted locally on a homebuilt distributed system that randomly chooses from a pool of 120 residential IPs.

We do have human emulation routines that helped avoid most detection, and that library is decoupled in such a way that we can edit behavior down to the individual site.

Some sites are just so damn good at detecting us and I just don't get it.

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

Well, when that option is available, as in the case of something like SAMBA WEB MVR, we absolutely opt for that instead, and pay our dues.

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

I still don't agree. The whole reason this business exists is to remove the cost from all the industries that need to run background checks.

I think the extent and reason for the checks aren't apparent. So I'll give a few examples where we have high volume and I hope that will enlighten you as to the reason why there are so many players in the industry.

The highest volume checks are around the medical and teaching fields. We often run recurring checks every six months to one year on teachers and doctors to ensure licenses and certifications are still active, as well as the immunizations necessary to work in their environments.

Do you expect a low margin industry like teaching to staff a full time employee to do nothing but run background checks? They want them done and the schools have access to the information, it's just much easier for them to pay us a few dollars an employee and get a nice report than do the legwork themselves.

Additionally, incurring the cost of access for the relevant data is a barrier for companies without a bunch of cash lying around.

We don't solicit companies with incriminating information about their employees; it's a necessary part of a safe environment.

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

I don't have a perspective on the ethics of easier background checks. We run employment checks; the ultimate decision of whether to hire falls to the customer ALWAYS. I've seen plenty of former criminals get hired. It's a workplace culture 'thing'.

The right to be forgotten is alive and well most of the time. 90% of our clients don't observe information further back than a few years. I feel like that is a fair assessment of someone's behavior.

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

There are a number of ways we do this.

First, the process of automating a source is not as simple as 'grab data, send to person that creates the case'.

We have many, many layers of precaution and validation, both by humans and other automated systems, that help guarantee accuracy.

On top of this, even public records have reporting rules in the industry. There are dates, specific charges, charge types (Misdemeanor/Felony), and a battery of other rules that the information is processed through in order to ensure we do not report information that we are not allowed to report.

We always lean to the side of throwing a case to a human. In the circumstance that anything new, unrecognized, or even slightly off happens, we toss the case to a team that processes the information by hand. At that point, we are simply a scraper for information and we cut out the step of having a human order and retrieve results.

We do not go back 40 years. Industry standards dictate that most records older than 7 years are expunged from employment background checks. And most of our clients don't care about more than 3 years' worth, with exceptions like murder, federal crimes, and some obviously heinous things.
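As a very simplified illustration of the rule-then-human flow described above, here's a sketch. It is not our actual rule set: the field names, the two known charge types, and the flat 7-year cutoff are all assumptions made for the example, and real reporting rules have many more conditions.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Decision(Enum):
    REPORT = "report"
    SUPPRESS = "suppress"
    HUMAN_REVIEW = "human_review"


KNOWN_CHARGE_TYPES = {"misdemeanor", "felony"}  # assumed, per the comment above
LOOKBACK_YEARS = 7  # assumed flat cutoff; real rules carry exceptions


@dataclass
class Record:
    charge_type: str
    disposition_date: date


def whole_years_between(earlier, later):
    """Completed years from earlier to later."""
    years = later.year - earlier.year
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years


def triage(record, today):
    """Decide what happens to one record; anything odd goes to a human."""
    if record.charge_type.lower() not in KNOWN_CHARGE_TYPES:
        return Decision.HUMAN_REVIEW  # new or unrecognized charge type
    if record.disposition_date > today:
        return Decision.HUMAN_REVIEW  # a future date is "slightly off"
    if whole_years_between(record.disposition_date, today) >= LOOKBACK_YEARS:
        return Decision.SUPPRESS  # at or past the lookback window
    return Decision.REPORT


today = date(2019, 7, 1)
print(triage(Record("Felony", date(2018, 1, 15)), today))
print(triage(Record("Misdemeanor", date(2010, 3, 2)), today))
print(triage(Record("Infraction", date(2018, 1, 15)), today))
```

The ordering matters in this sketch: the 'send it to a human' checks run before any automated decision, so an unrecognized record is never silently reported or dropped.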

We also run a number of other tests outside of public records to provide full background data. We have integrations with major labs to schedule drug screens. We also allow those who are having a background check run on them to fill out an application to provide reasoning and information from their point of view, so that customers can empathize with an employee.

We also have a robust dispute system. The person having a background check run on them receives the report before the client requesting it in order to review the results and dispute anything they find wrong. These cases are always handled by a human, and often involve intensive research, no cost spared, to ensure the accuracy of the report.

There's a plethora of other things I'm missing, but if you have any specific questions, I'm happy to answer.

*EDIT

To clarify, there is a lot of information in public records. It isn't unclear or ambiguous at all. Motor Vehicle and Court records are extremely in-depth and spare no detail.

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

I don't empathize with your viewpoint because, whether it's a web scraper, or a person, the work is exactly the same. There's no additional volume, or extra steps. We just emulate a worker.

We measure the value in FTEs, and when a researcher quits, we do not replace them if the appropriate FTEs have been reached with projects.

It's a major benefit to the business not only because we don't have to pay another employee, but we can reduce training costs, and costs incurred by mistakes. We can also adjust execution of one of these agents, which normally would require rearrangement of work instructions, and retraining.

These are public records. 90% of them do not have integrations for automated systems, and we utilize those that do. They are typically search boxes with results. We are not circumventing any type of cost that would otherwise be incurred.

We do not log any of the results, store them locally, or maintain any of the PII with each search. If a case was searched 20 minutes ago, and comes up again, we rerun the entire thing just as a human would.

Finally, to your point about 'help me with my homework', I consider posting on the HN forums homework for this type of research. There is a diverse set of talented developers on here with esoteric experience. The fact that an article related to the work I do came up on here, I thought, was an excellent opportunity to seek advice and perspective.

Pinbenterjamin | 6 years ago | on: Detecting Chrome headless, the game goes on

I run the division at my company that builds crawlers for websites with public records. We scrape this information on-demand when a case is requested, and we handle an enormous volume of different sites (or sources as we call them). We recently passed 700 total custom scrapers.

Recently, we have seen a spike in sites that detect and block our crawlers with some sort of JavaScript we cannot identify. We use headless Chrome and Selenium to build out most of our integrations. I'm starting to wonder if the science of blocking scraping is getting more popular...

I don't think what I'm doing is subversive at all, we're running background checks on people, and we can reduce business costs by eliminating error-prone researchers with smart scrapers that run all day.

I don't want to seem like the bad guy here, but what if I wanted to do the opposite of this research? Where do I start? Study the Chromium source? Can anyone recommend a few papers?

Pinbenterjamin | 6 years ago | on: IBM's Acquisition of Red Hat Closes

Understandable, as IBM has always targeted the enterprise solutions market. Having control over (I think still?) the most popular enterprise Linux installation is a nice add to their business.

It opens greater flexibility for their servers, as they have 'ownership' of the distro now. Possible bid for a speed benchmark as a selling point? Some of the giants out there are going to be hard to beat for that though...

I don't know that they are doing this specifically for the cloud market.

Pinbenterjamin | 7 years ago | on: Base salaries offered to software engineers in SF, NYC, and Seattle

Hey Ben! (Also) Ben here.

I work in New Jersey, as an Enterprise App Full-stack dev with 0 college experience.

I started as a developer 4 years ago @15/hr, and I recently breached the 6 figure mark. I've told my coworkers that I have a fire lit by a sense of inadequacy. I've always been behind classically trained developers, that is what keeps me pushing forward. Always playing 'catch-up'.

Now that I'm involved in my company's interview process, a few thoughts on what having a college degree does for our offers;

1. Because we hire through an agency, we already have a single layer of vetting that helps remove unqualified persons (both college-level and not), which ensures we have a decent, homogeneous pool to conduct face-to-face interviews with.

2. Whether we hire a candidate with or without a college background, our offer range isn't enormous. If we have an offer in mind (say 75k), having a college degree doesn't automatically grant you the high end of our scale. I can't think of a single instance where we cared about their degree once they were in the face-to-face. Their performance in the interview dictates their offer, and we aren't asking questions like 'How do you implement bubble sort'. We ask some hard skill questions, sure, but we also ask just as many communication and general problem solving ones as well.

3. The range of starting pay is really small for us (think 75k mid, 70k min, 80k max). So that 3% gap makes a lot of sense.

To summarize, a degree may put you in a position to interview, but the range you get paid depends largely on a wide array of skills, some of which are unrelated to development. Missing some of the hard skills won't disqualify you for a position as much as missing the communication skills will (for us).

Pinbenterjamin | 7 years ago | on: Facebook to Integrate Instagram, Messenger and WhatsApp

The important bit here is that, they will remain individual applications, with a unified underlying architecture.

It is a smart move, internally, to leverage a single framework for dealing with similar services. Less training, and greater lateral agility for employees. Developing a new feature for WhatsApp, and need some extra manpower? Shift some workers off one of your other messenger services. Right now that may not be possible.

I decline to believe at this point that there is any corporate reason other than that.
