throwawaywego's comments
throwawaywego | 6 years ago | on: You Can Now Tell Facebook to Delete Its Internal Record of Your Face
throwawaywego | 6 years ago | on: China sows disinformation on Hong Kong using porn accounts on Twitter
Really no better than: "Let's discuss: Employee China stole something from the communal fridge." "Sure, but what about Employee US? I caught him stealing last year. I consider it very probable that Employee US is stealing from Employee China right now. Maybe that's why Employee China was so hungry; he was forced to steal because Employee US started it. Maybe Employee China did not even steal anything and just took the blame for an irredeemable thief. Let's discuss and pontificate about that hypothetical instead!"
throwawaywego | 6 years ago | on: DeepFaceLab: A tool that utilizes ML to replace faces in videos
What I think Zao does is preprocess the videos (manually or highly accurate facepoint detection). They pre-calculate the transforms (standard facemorphing algorithms with opacity/alpha tweaks), and shading depending on the scene lighting. Then they just need a good frontal selfie, or do some frontalization, and keypoint detection, and the rest can be rendered/computed without much resources, following a pre-defined script.
If more advanced than facemorphing, then perhaps something more like: https://github.com/facebookresearch/supervision-by-registrat... (pre-fitting a 3D face mask, then texturing it with your selfie)
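The core interpolation such a pre-computed pipeline would rest on can be sketched as follows (the helper functions and toy landmark values are my own illustration, not Zao's actual code):

```python
def morph_points(src_pts, dst_pts, alpha):
    """Linearly interpolate corresponding facial landmarks.
    alpha=0 keeps the source geometry, alpha=1 reaches the target."""
    return [((1 - alpha) * sx + alpha * dx, (1 - alpha) * sy + alpha * dy)
            for (sx, sy), (dx, dy) in zip(src_pts, dst_pts)]

def cross_dissolve(src_px, dst_px, alpha):
    """Blend the (already warped) pixel intensities with the same weight."""
    return [(1 - alpha) * s + alpha * d for s, d in zip(src_px, dst_px)]

# halfway morph between two toy landmark sets
mid = morph_points([(0.0, 0.0), (10.0, 0.0)], [(2.0, 2.0), (12.0, 2.0)], 0.5)
```

Since the landmark correspondences and blend weights can all be computed ahead of time per frame, the per-user work reduces to fitting one selfie's keypoints, which is why it can run cheaply at scale.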
throwawaywego | 6 years ago | on: China sows disinformation on Hong Kong using porn accounts on Twitter
> Claim: The US is supporting and encouraging Hong Kong protests.
> Verdict: Conspiracy theory without evidence.
> For years, pro-Kremlin media has used the narrative about anti-government protests being funded by the US. Examples include colour revolutions in post-soviet states, the “Arab Spring” revolts, and Euromaidan in 2014.
> The Hong Kong protests began in June 2019 because of a controversial extradition law that would allow for the transfer of suspects to face trial on the Chinese mainland.
https://euvsdisinfo.eu/report/the-us-is-supporting-and-encou...
throwawaywego | 6 years ago | on: NLP's Clever Hans Moment Has Arrived
It would be a copout. Instead of actually tackling AI's problem of common sense, you claim that maybe layers of logistic regression and matrix factorization are all there is, and that we are their equal, just a few layers up in abstraction and evolution. Does one really stem and count tokens to decide whether a movie review has negative sentiment? Or does one empathize with its writer and build a complete model of them in one's head?
The horse would be the AI researcher claiming reasoning and understanding from an activation vector trained on word co-occurrence on Wikipedia; the farmer giving cues would be the heated community and industry, mistaking impressive benchmark performance for a solution to a problem they're starting to forget.
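To make the contrast concrete, "stem and count tokens" amounts to something like this (the word lists are a toy illustration, not any actual benchmark model):

```python
POSITIVE = {"great", "masterpiece", "loved", "best"}
NEGATIVE = {"bad", "boring", "awful", "worst"}

def token_count_sentiment(review):
    """Score a review by surface word counts alone: no model of the
    writer, no context, just token membership in two word lists."""
    tokens = review.lower().split()
    score = (sum(t in POSITIVE for t in tokens)
             - sum(t in NEGATIVE for t in tokens))
    return "negative" if score < 0 else "positive"
```

A scorer like this happily labels "this was not bad" as negative, because surface statistics never model the writer — which is the whole point of the Clever Hans comparison.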
throwawaywego | 6 years ago | on: What a tweet tells us about spy satellites [video]
Because it is common to reconstruct a signal by taking multiple measurements instead of a single sample. The field of compressed sensing broke ground on effective sampling. If (some of) the error is random noise, then you can remove it by majority vote. It would be wasteful not to re-use that high-resolution secret drone fly-over footage when composing satellite imagery at a later date. Inpainting, upscaling, de-oldifying, automatic colorization, 3D modeling, composition (see the black hole photo process), etc. have become commonplace in the ML community, so I have reason to assume these techniques are also used to enhance the resolution of satellite imagery and fill in unobserved regions with guesstimates.
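A toy sketch of that averaging/majority-vote idea (the numbers are illustrative, not from any real imaging pipeline):

```python
import random
import statistics

def denoise_by_stacking(true_value, n_frames, noise_sd, seed=0):
    """Average n_frames independent noisy observations of the same pixel.
    Zero-mean random noise cancels out, so the mean of the stack
    converges on the true value as n_frames grows."""
    rng = random.Random(seed)
    frames = [true_value + rng.gauss(0, noise_sd) for _ in range(n_frames)]
    return statistics.mean(frames)
```

With noise of standard deviation 5, a single frame is off by about 5 on average, while a stack of 1000 frames lands within a fraction of a unit of the true value (error shrinks roughly as 1/sqrt(n)).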
> how are you going to connect the coloured blob that is the car with an image taken by a traffic cam at a different time in a country that doesn't give you access to its traffic cams?
Install road-side cameras inside rented housing / contracted freedom fighters' homes? Hack the traffic cams?
Didn't the NSA track mobile phones in foreign countries by installing similar beacons / hacking sensors in bigger cities? That would theoretically allow them to look through the car roof and "see" who is in the backseat.
> The spy agency is said to be tracking the movements of “at least hundreds of millions of devices” in what amounts to a staggeringly powerful surveillance tool. It means the NSA can, through mobile phones, track individuals anywhere they travel – including into private homes – or retrace previously traveled journeys.
> The NSA provided some input into the report, with one senior collection manager, granted permission to speak to the newspaper, admitting the agency is “getting vast volumes” of location data from around the planet by tapping into cables that connect mobile networks globally.
> According to the Post, the NSA is applying sophisticated mathematical techniques to map cell phone owners’ relationships, overlapping their patterns of movement with thousands or millions of other users who cross their paths.
throwawaywego | 6 years ago | on: What a tweet tells us about spy satellites [video]
The US could have built test facilities in their deserts, have a 3D model available for proper reconstruction, and then learned to stitch and skew all imagery back into a single composite image. There may even be some "filling in" or "sharpening" of pixels or textures that could not be observed but are guessed from their context.
Within the framework of composite imagery, it would indeed be possible to zoom in until you get to cameras capturing road traffic (maybe the license plate was not observed at the moment the main photo was taken, but was remembered from an observation by a traffic camera 30 minutes earlier and stitched back onto the object: composite imagery through time).
Finally, you could use multiple non-image sources for the composition. If three (ground) sensors capture the noise, heat, or vibrations from a train on a track, you can triangulate and draw the location of that train onto space photos at a timestamp of your choosing.
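That triangulation step is simple geometry. A minimal 2D sketch (the sensor positions and distances are made up; in practice the distances would come from arrival times multiplied by wave speed):

```python
def trilaterate(p1, d1, p2, d2, p3, d3):
    """Locate a source from three sensor positions and measured distances.
    Subtracting the three circle equations pairwise cancels the quadratic
    terms, leaving a 2x2 linear system solved by Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x2), 2 * (y3 - y2)
    c2 = d2**2 - d3**2 + x3**2 - x2**2 + y3**2 - y2**2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# three sensors, distances measured to a source actually at (3, 4)
pos = trilaterate((0.0, 0.0), 5.0, (10.0, 0.0), 65 ** 0.5, (0.0, 10.0), 45 ** 0.5)
```

Three sensors are the minimum for an unambiguous 2D fix; more sensors would turn this into a least-squares problem and also average out timing noise.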
throwawaywego | 6 years ago | on: A New Algorithm for Controlled Randomness
Randomness is by (some) definition unpredictable. But humans are so eager for pattern recognition that they will see, or expect, patterns that just are not there.
"Pareidolia is the tendency to interpret a vague stimulus as something known to the observer, such as seeing shapes in clouds, seeing faces in inanimate objects or abstract patterns, or hearing hidden messages in music." and also: https://en.wikipedia.org/wiki/Apophenia#Causes
On a similar note, humans are terrible at coming up with random/unpredictable sequences. If you ask a group of test subjects to pick a random number between 1 and 10, you gain a huge edge by always guessing 3 or 7.
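A quick simulation of that edge (the answer distribution below is hypothetical, loosely shaped after the common finding that 7 and 3 are over-picked):

```python
import random

# Hypothetical distribution of "pick a number from 1 to 10" answers;
# the weights are invented for illustration and sum to 100.
HUMAN_PICK_WEIGHTS = {1: 3, 2: 5, 3: 15, 4: 8, 5: 10,
                      6: 8, 7: 28, 8: 10, 9: 8, 10: 5}

def guess_edge(guess, trials=100_000, seed=1):
    """Fraction of simulated subjects matched by always guessing `guess`.
    A truly uniform picker would be matched only 10% of the time."""
    rng = random.Random(seed)
    picks = rng.choices(list(HUMAN_PICK_WEIGHTS),
                        weights=list(HUMAN_PICK_WEIGHTS.values()),
                        k=trials)
    return picks.count(guess) / trials
```

Under these assumed weights, always guessing 7 matches close to 28% of subjects, nearly triple the uniform baseline.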
throwawaywego | 6 years ago | on: Exploring Weight Agnostic Neural Networks
> There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic models and treats the data mechanism as unknown. The statistical community has been committed to the almost exclusive use of data models. This commitment has led to irrelevant theory, questionable conclusions, and has kept statisticians from working on a large range of interesting current problems. Algorithmic modeling, both in theory and practice, has developed rapidly in fields outside statistics. It can be used both on large complex data sets and as a more accurate and informative alternative to data modeling on smaller data sets. If our goal as a field is to use data to solve problems, then we need to move away from exclusive dependence on data models and adopt a more diverse set of tools.
throwawaywego | 6 years ago | on: What Netflix’s ‘Great Hack’ Gets Wrong About Cambridge Analytica
For a factual analysis of the effects and strategies employed by Obama (on a casual glance, most of which support the statement that Obama's campaign was highly successful), do a search on Google Scholar. Here are a few highly cited political science sources I was able to pull (I need to get back to work now).
> Digital media in the Obama campaigns of 2008 and 2012: Adaptation to the personalized political communication environment
> This essay provides a descriptive interpretation of the role of digital media in the campaigns of Barack Obama in 2008 and 2012 with a focus on two themes: personalized political communication and the commodification of digital media as tools. The essay covers campaign finance strategy, voter mobilization on the ground, innovation in social media, and data analytics, and why the Obama organizations were more innovative than those of his opponents. The essay provides a point of contrast for the other articles in this special issue, which describe sometimes quite different campaign practices in recent elections across Europe.
> From Networked Nominee to Networked Nation: Examining the Impact of Web 2.0 and Social Media on Political Participation and Civic Engagement in the 2008 Obama Campaign
> This article explores the uses of Web 2.0 and social media by the 2008 Obama presidential campaign and asks three primary questions: (1) What techniques allowed the Obama campaign to translate online activity to on-the-ground activism? (2) What sociotechnical factors enabled the Obama campaign to generate so many campaign contributions? (3) Did the Obama campaign facilitate the development of an ongoing social movement that will influence his administration and governance? Qualitative data were collected from social media tools used by the Obama ‘08 campaign (e.g., Obama ‘08 Web site, Twitter, Facebook, MySpace, e-mails, iPhone application, and the Change.gov site created by the Obama-Biden Transition Team) and public information. The authors find that the Obama ‘08 campaign created a nationwide virtual organization that motivated 3.1 million individual contributors and mobilized a grassroots movement of more than 5 million volunteers. Clearly, the Obama campaign utilized these tools to go beyond educating the public and raising money to mobilizing the ground game, enhancing political participation, and getting out the vote. The use of these tools also raised significant national security and privacy considerations. Finally, the Obama-Biden transition and administration utilized many of the same strategies in their attempt to transform political participation and civic engagement.
> The Internet's Role in Campaign 2008
> A majority of American adults went online in 2008 to keep informed about political developments and to get involved with the election.
throwawaywego | 6 years ago | on: What Netflix’s ‘Great Hack’ Gets Wrong About Cambridge Analytica
I'd agree that it may have been overblown (just like the Russian interference may have been overblown). And of course the marketers ran with it and turned it into a sales pitch.
But that detracts only a little from the effectiveness of Obama's digital campaign. As the first of its kind, relative to other campaigns that lacked a modern digital strategy, it conferred a significant edge. Your argument seems to be of the form: "Hercules is strong. Some say he is really, really strong. Ergo, Hercules was not strong."
2008: > The key technological innovation that brought Barack Obama to the White House wasn’t his tweets or a smartphone app. It was the Obama campaign’s novel integration of e-mail, cell phones, and websites. The young, technology-savvy staffers didn’t just use the web to convey the candidate’s message; they also enabled supporters to connect and self-organize, pioneering the ways grassroots movements would adapt and adopt platforms in the campaign cycles to come.
> but a network of supporters who used a distributed model of phone banking to organize and get out the vote, helped raise a record-breaking $600 million, and created all manner of media clips that were viewed millions of times. It was an online movement that begot offline behavior, including producing youth voter turnout that may have supplied the margin of victory.
> All of the Obama supporters who traded their personal information for a ticket to a rally or an e-mail alert about the vice presidential choice, or opted in on Facebook or MyBarackObama can now be mass e-mailed at a cost of close to zero.
2012: > Once again, the Obama campaign built a dream team of nerds to create the software that drove many aspects of the campaign. From messaging to fund-raising to canvassing to organizing to targeting resources to key districts and media buys, the reelection effort took the political application of data science to unprecedented heights. The Obama team created sophisticated analytic models that personalized social and e-mail messaging using data generated by social-media activity.
> The Republican side, too, tried to create smarter tools, but it botched them. The Romney campaign’s “Orca,” a platform for marshaling volunteers to get out the vote on election day, suffered severe technical problems, becoming a cautionary tale of how not to manage a large IT project. For the moment, the technology gap between Democrats and Republicans remained wide.
https://www.nytimes.com/2008/11/10/business/media/10carr.htm...
https://www.technologyreview.com/s/611823/us-election-campai...
throwawaywego | 6 years ago | on: What Netflix’s ‘Great Hack’ Gets Wrong About Cambridge Analytica
You gather the data required to make a good probability prediction of voter preference ((soft) labels for this are easier to find than swing-voter labels). Then, wherever the model is uncertain, those are your swing / on-the-fence voters.
> Postcode? Age? Race? Gender? Income?
When it is found to be cost-effective: All and everything that is allowed by law and then some. In its pitch deck, Facebook boasted about its advertisers being able to target and identify: university, degree, concentration, course history, class year, housing/dormitory, age, gender, sexual orientation, zip (home and university/work), relationship status, dating interests, personal interests, club membership, jobs, political bent, friend graph, site usage/addiction level.
Likes make this very easy (with a little luck, you can deduce all of zip, age, race, gender, income from a list of Likes).
> What is it about CA's methods that were so effective?
Hillary Clinton: “The real question is how did the Russians know how to target their messages so precisely to undecided voters in Wisconsin or Michigan or Pennsylvania – that is really the nub of the question. So if they were getting advice from say Cambridge Analytica, or someone else, about ‘OK here are the 12 voters in this town in Wisconsin – that’s whose Facebook pages you need to be on to send these messages’ that indeed would be very disturbing.”
FBI: Using those techniques in June 2016, “the GRU compromised the computer network of the Illinois State Board of Elections by exploiting a vulnerability in the SBOE's website,” the report said. “The GRU then gained access to a database containing information on millions of registered Illinois voters, and extracted data related to thousands of U.S. voters before the malicious activity was identified. Similarly, in November 2016, the GRU sent spearphishing emails to over 120 email accounts used by Florida county officials responsible for administering the 2016 U.S. election,” the report said. “The spearphishing emails contained an attached Word document coded with malicious software (commonly referred to as a Trojan) that permitted the GRU to access the infected computer.”
> After all someone had to do something similar for Obama.
Obama's digital campaign was very successful, but the above seems to indicate that Kushner's campaign was far more aggressive and less scrupulous (and may have had connections with, or help from, foreign adversaries).
It may also be that propaganda and smears work better depending on the target's political preference, level of education, and neurosis. Even if Hillary had spent the same amount of money and energy efficiently (some reports indicate that Hillary's digital campaign was a waste of money and displayed poor management), it may simply be easier to sway a voter to vote Republican if you can target their fears of immigrants, religious beliefs, distrust of government gun regulation, and conspiracy theories. Surely the many wolf cries about fake news, and the retweeting of conspiracy theories, have set up the Trump base for easier manipulation (you can simply create a meme to counter a story in a respected journal, or keep them guessing about its alternative truth).
throwawaywego | 6 years ago | on: Hundreds of extreme self-citing scientists revealed in new database
> The rate of duplication in the rest of the biomedical literature has been estimated to be between 10% to 20% (Jefferson, 1998), though one review of the literature suggests the more conservative figure of approximately 10% (Steneck, 2000). https://ori.hhs.gov/plagiarism-13
If work by another author was enough to inspire you and merit a reference, then your own previous work should certainly qualify if it informed the current paper. Self-citation provides a "paper trail" for readers who want to investigate a claim or proof further.
(As with PageRank, it is quite possible to discount internal references relative to external ones, and if you also take into account the authority of the citing source, you avoid scientists accumulating references from non-peer-reviewed arXiv publications.)
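A toy sketch of that discounting: a small power-iteration PageRank over a citation graph where a self-citation (same author on both papers) carries only a fraction of a normal citation's weight (the 0.2 factor and the example graph are arbitrary illustrations):

```python
def pagerank(edges, authors, damping=0.85, self_weight=0.2, iters=50):
    """Toy PageRank over a citation graph; a self-citation counts for
    self_weight of a normal citation when weights are normalized."""
    nodes = sorted({n for edge in edges for n in edge})
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out = {n: [] for n in nodes}
    for src, dst in edges:
        w = self_weight if authors[src] == authors[dst] else 1.0
        out[src].append((dst, w))
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            total = sum(w for _, w in out[src])
            if total == 0:  # dangling paper: spread its rank evenly
                for n in nodes:
                    new[n] += damping * rank[src] / len(nodes)
            else:
                for dst, w in out[src]:
                    new[dst] += damping * rank[src] * w / total
        rank = new
    return rank

# paper "s" (author A) cites its own "x" (also A) and external "y" (author B)
ranks = pagerank([("s", "x"), ("s", "y")], {"s": "A", "x": "A", "y": "B"})
```

With the discount, the externally cited paper "y" ends up ranked above the self-cited "x"; with self_weight=1.0 they tie, which is the extreme-self-citer loophole the article describes.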
throwawaywego | 6 years ago | on: You May Be Better Off Picking Stocks at Random, Study Finds
Amateurs tend to pick uptrending, correlated stocks (all stocks trending up in the U.S.). When the U.S. economy crashes, all their eggs are in the same basket.
If you told the amateurs to pick the inverse, they'd go for downtrending Chinese stocks. When China's economy crashes, all their eggs are still in one basket.
I think what you are looking for instead is a contrarian trading strategy, where you run counter to what the large herd is doing. A good contrarian strategy for buying bitcoin may be to gauge crypto sentiment on Hacker News: if the majority is gloomy or pessimistic, the price is below future value; if articles get posted on how to build your own blockchain in Python, you should be ready to start cashing out, because one month later every smart nephew's uncle has fomo-bought and panic-sold the hype and caused a crash or depression. Similarly, if the U.S. president is glowing about the heated economy and dismissive of China, this opens up profitable options for contrarian traders (less risky / better informed than picking completely at random or following the herd).
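That herd-fading rule as a sketch (the sentiment scale and thresholds are invented for illustration; in practice the score would come from some sentiment model over forum posts):

```python
def contrarian_signal(crowd_sentiment):
    """Fade the crowd. crowd_sentiment in [-1, 1]:
    -1 = everyone is gloomy/capitulating,
    +1 = peak euphoria ('build your own blockchain' posts everywhere)."""
    if crowd_sentiment <= -0.6:
        return "buy"   # herd has capitulated; price likely below value
    if crowd_sentiment >= 0.6:
        return "sell"  # herd is all-in; the hype crash comes next
    return "hold"
```

The dead band in the middle matters: a contrarian only acts at sentiment extremes, since mild optimism or pessimism carries no reliable edge.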
throwawaywego | 6 years ago | on: I've reproduced 130 research papers about “predicting the stock market”
When you gather enough of these commonly used technical analysis indicators, it's like having to predict which startup Ron Conway will invest in, except you can compute Conway in a Python one-liner and keep him up to date by attending the weekly sermon.
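For instance, one of the most commonly shared signals really does fit in a couple of lines (the 5/20 periods are just conventional defaults, not from any specific paper):

```python
def sma(prices, n):
    """Simple moving average over the last n closes."""
    return sum(prices[-n:]) / n

def golden_cross(prices, fast=5, slow=20):
    """The textbook buy signal that everyone computes the same way:
    the fast moving average sitting above the slow one."""
    return sma(prices, fast) > sma(prices, slow)
```

Because so many participants compute exactly the same thing, the "prediction" is less about the market and more about anticipating what the herd's shared formula will say next.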
throwawaywego | 6 years ago | on: Troubles with the AWS web console
- Create account. Enter credit card details, but verification SMS never shows up. Ask for help.
- I get called at night (I'm abroad) by an American service employee, we do verification over the phone.
- Try to get the hang of things myself. Lost in a swamp of different UIs. Names of products don't clarify what they do, so you first need to learn to speak AWS, which is akin to using a chain of 5 dictionaries to learn a single language.
- Do the tutorials. Tutorials are poorly written, in that they take you by the hand and make you do things without any idea of what you are actually doing (Oh, I just spun up a load balancer? What is that and how does it work?).
- Do more tutorials. Tutorials are badly outdated. Now you have a hand-holding tutorial leading you through the swamp, but at every simple step you bump your knee against a UI element or layout that does not exist in the tutorial. It makes you feel like you wasted your time, and that no one at AWS is even aware that tutorials may need updating whenever a design department gets the urge to justify its spending with a redesign.
- Give up and search for recent books or video courses. Anything older than 3-4 years is outdated (UIs have changed, products have been deprecated, or new products have been added).
- Receive an email in the middle of the night: you've hit 80% of your free usage plan. Log in. Click around for 20 minutes until I find the load balancer is still up (weird, I could have sworn I spun that entire tutorial down). Kill it, go back to sleep.
- Next night, new email: you've gone $3.24 over your free budget, please pay. 30 minutes later: we've detected unusual activity on your account. 1 hour later: your account has been deactivated; AWS takes fraud and non-payment very seriously.
Now I need a new phone number/name/address to create a new account. I am always anxious that AWS will charge me for something I don't want, and I can't find the UI that shows all the running tutorial leftovers that I really don't want to pay for. I know the UI is unintuitive, inconsistent, and out of sync with both the technical writers and the tutorial writers. And I know that learning AWS consists of learning where tutorials and books are outdated, or stumbling around until you find the correct sequence of clicks in a "3 minutes max." tutorial step.
AWS has grown fat and lazy. The lack of design and onboarding consistency is typical of a company that size. Outdated tutorials show a lack of inter-team communication, and seem to indicate that no one at AWS reruns the onboarding tutorials every month so they know what their customers are complaining about (or why customers, like me, try to shun their mega-presence).
(EDIT: The order of my experiences may be a bit jumbled, sorry. More constructive feedback: 1) I'd want a safe tutorial environment, with no (perceived) risk of having to pay for dummy services. 2) I want the tutorial writer to have the customer's best interest in mind: "For a smaller site, load balancing may be overkill and can double your hosting costs for no tangible gain." beats "Hey Mark, we need more awareness and usage of the new load balancer. I need you to write a stand-alone tutorial and add the load balancer to the sample web page tutorial." 3) Someone responsible for updating the tutorials (even if only: "This step is deprecated. Please hold on for a correction."). 4) A unified and consistent UI and UX. Scanning, searching, sorting, etc. should work without making me think; I don't want a different UI model for every service. Someone or some team should set common recipes and boundaries for the different 2-pizza teams, so I don't get a pizza UI with all possible ingredients.)