The Evolution of Data at Reddit

Those query params take me back! In a bid to understand how Reddit posts are shared across the web, we introduced those two parameters a few years back. They helped us understand when and how things were getting shared outside of Reddit.

I don't think they're actively being used these days, and thanks to your mentioning it, I'm going to create a task to verify that and remove them.

The answers astounded me: Reddit used the free tier of Google Analytics

I remember this exact conversation in my interview, and I laughed because I thought it was a joke.

It's been really cool to transition from not being able to answer any questions, to being able to answer them nightly, to now being able to answer them as needed.

One of the most important parts of a fast and flexible data stack is that we now have the ability to use the data in production systems in more robust ways. A well-documented example is (like you mentioned) rebuilding view counting from a nightly, subreddit-level job into a near-realtime process that can work on each piece of content on the site.

Is there really no better option than poisoning URLs with multiple query parameters? Specifically, ?st and ?sh

It's not a big deal, but these poison my bookmarks - sometimes I bookmark the same comment section multiple times, and it's nice to wipe them all out when I've finally got around to one of them. Now, technically the URLs are different due to the query parameters, so they are not identified as duplicates. At worst, can't you remove them from the URL after page load when they've been consumed?
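Until that happens, one workaround on the bookmarking side is to normalize URLs before comparing them. Here is a minimal sketch in Python, assuming the only parameters to drop are the `st` and `sh` ones mentioned above; the function name and the example URL are made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these are the only share-tracking parameters we want to drop.
TRACKING_PARAMS = {"st", "sh"}


def strip_tracking_params(url: str) -> str:
    """Return the URL with the tracking parameters removed, everything else kept."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Hypothetical URLs: two shares of the same comment section.
    a = "https://www.reddit.com/r/programming/comments/abc/title/?st=AAA&sh=111"
    b = "https://www.reddit.com/r/programming/comments/abc/title/?st=BBB&sh=222"
    print(strip_tracking_params(a) == strip_tracking_params(b))
```

Two bookmarks that differ only in those parameters then compare equal, while unrelated parameters such as a sort order survive.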

I didn't share those for two reasons:

1) We don't want to be seen to be advertising for them

2) In some cases, we've moved off of those tools, but we don't want to give the impression that they aren't great tools; they just didn't fit the use case we had at the time.

Googlebot's Javascript random() function is deterministic

Most random() functions are deterministic.

Most PRNGs are at least seeded with the number of ticks since the epoch or some other source of entropy. Googlebot's PRNG seems to be seeded with a constant value; that's what the article is about.
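To illustrate the difference, here is a small sketch using Python's standard `random` module rather than Googlebot's JavaScript engine. The helper name and the seed value 42 are arbitrary choices for the example.

```python
import random


def first_values(seed=None, count=3):
    """Return the first `count` floats from a fresh generator.

    With seed=None, Python seeds the generator from OS entropy (or the clock
    as a fallback), so each call gives a different sequence. With a fixed
    seed, the sequence is the same every time.
    """
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


if __name__ == "__main__":
    # Constant seed: the same "random" numbers on every run.
    print(first_values(42) == first_values(42))
    # No explicit seed: sequences differ from call to call.
    print(first_values() == first_values())
```

A generator seeded with a constant behaves like the first case: every page load sees the same sequence.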

I think what he means is that the random function isn't seeded properly, right?

If you want your program to be deterministic, why are you calling random()?

Harshita Arora female coding success algorithm: 1. Steal the code 2. Pass it as your own. 3. When the truth comes out, sue/get the actual creator fired 4. Profit

This is not programming, this is nonsense internet drama.

I feel certain that she isn't emblematic of "female coding success". You might as well get mad about briefcase kid again while we're on the topic of "teenager doesn't realize grownups have equal or greater google fu"

We can probably just call this unethical coding.

Where do you draw the line between IP theft and a stupid feel-good story? Just curious: if a project of yours were stolen by a young and rampant programmer, would you see it as a positive thing because it makes a feel-good story?

Now even YouTube serves ads with CPU-draining cryptocurrency miners

And then they will say: don't use ad blockers, support us...

Fuck these guys trying to mine a dollar of crypto currency on my system. If I wanted to mine $1.97 of cryptocurrency, I'd mine that 42 cents myself.

This is not an exploit. The entire paradigm of modern digital advertising revolves around ads running third party JavaScript on your machine, mainly to track you and mine any data they can access. There is usually little or no vetting of ads, so “abusive” ads (as if tracking users without their knowledge or consent isn’t abusive enough) such as these are only taken down after at least some damage is done. And there is basically no accountability.

It’s not safe, and anyone who cares about their security, privacy, bandwidth, or power consumption should definitely be using an adblocker that blocks all third party JavaScript by default. That isn’t enough to stop everything bad, but it does get rid of a very large portion of it.

Exactly why I always use adblock. Ads are a nuisance beyond just being annoying. They slow down your browser and can compromise security.

What's common between coding interviews and the game of Snake.

TLDR, sliding window.
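For anyone who hasn't met the technique, here is a generic sliding-window sketch in Python. It is not taken from the article; the problem (largest sum of `k` consecutive values) and the function name are just a common textbook illustration. The loose link to Snake is that a fixed-length window moves like the snake's body: each step adds one cell at the head and drops one at the tail.

```python
def max_window_sum(values, k):
    """Return the largest sum of any k consecutive items in `values`.

    The window total is updated in O(1) per step by adding the element that
    enters the window and subtracting the one that leaves it.
    """
    if k <= 0 or k > len(values):
        raise ValueError("k must be between 1 and len(values)")

    window = sum(values[:k])  # total of the first window
    best = window
    for i in range(k, len(values)):
        window += values[i] - values[i - k]  # slide one position to the right
        best = max(best, window)
    return best


if __name__ == "__main__":
    print(max_window_sum([1, 2, 3, 4, 5], 2))
```

The whole pass is linear in the input length, instead of recomputing each window's sum from scratch.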

Pretty sure the fucking huge battery compared to the power of components in the phone was the main contributing factor.

I was surprised this was an article not a meme.

"When you're told you have to do interviews, but you don't actually want to hire anyone"

BSDScheme -- a Scheme interpreter implementation in D

i'd rather read 100 posts about hobby scheme implementations than even one more about another JS framework

Reddit makes it too easy for people to be armchair critics.

Hey! I'm the author of BSDScheme. Feel free to AMA!

It's highly ironic that you found the post so pointless yet you bothered to comment on it.

Not everybody has a university education in computer science. Moreover, some people just have fun writing interpreters. I certainly do. I also find it interesting to read through different implementations to compare the different approaches, syntax quirks, and abstractions a language provides.

I'm glad this was posted here.

Microsoft’s Performance Contributions to Git in 2017

Microsoft has moved in the right direction during the last few years. It seems like they are more willing to support and contribute to other projects.

A guy I know who works at Microsoft sent a git checkout screenshot that said:

Switched to branch 'official/XXXXX'
Your branch is behind 'origin/official/XXXXX' by 1121625 commits, and can be fast-forwarded.

I'm not sure how long it had been since he pulled that branch, but any branch that is behind by more than a million commits is seriously behind...

These days I re-route all my hate towards Oracle. So this is great news. Please join me in re-routing hate.

20+ people

The other blog post linked in this article states that there are ~3500 Windows engineers, so yeah, the total commits can go up pretty darn quickly, I imagine.

A Brief Totally Accurate History Of Programming Languages

The original is so much better that it isn’t even funny

Copies someone else’s idea, modifies it slightly to make said idea worse, posts it as his own. Yup, this guy checks out, he’s a programmer.

Came here to say this. The original is absolute gold. This is meh.

"It's a syntax error to write Fortran while not wearing a blue tie"

stupid, derivative, unimaginative, and skirting the line of stealing other people's content and being inspired by it. takes the idea from something hilarious but forgets the execution and good writing

JavaScript Markdown WYSIWYG Editor

It is a nice project. TUI is an unfortunate acronym/initialism as it also stands for "Text-based user interface".

very nicely put together package here

the wysiwyg mode in addition to the markdown split-view is excellent

obviously great performance on the demonstration page

that chart extension is wild — really cool!

Amazing! I was looking recently into markdown editors and most of them didn't support tables and/or images, this is very polished and I will definitely use it (and hopefully contribute if I can). Great job!

I wasn't aware that Markdown could make graphs.

Electron is Cancer

Naming things that are not actual cancer as cancer is cancer. Please go back to 9gag.

Wirth's law

Wirth's law, also known as Page's law, Gates' law and May's law, is a computing adage which states that software is getting slower more rapidly than hardware becomes faster.

I dunno, I use vscode as a secondary editor after vim, mostly for debugging, as debugging from vim is a pain in the ass.

I have used it for Go, for C#, for F#, and it all worked quite well. It has always worked blazingly fast, even for large projects. Right now it uses around 1-2% of my 16GB memory with quite a large Go project open, with a few plugins enabled.

Yes, I guess you could have made it more efficient. But if you can get a lot of productivity while sacrificing a bit of efficiency, while still running fast enough for most of your users, why not? We are using garbage collected languages after all.

Also, some nitpicking:

You are not your end-users, and if you are a developer you most likely do not run average hardware.

Writing this in an article about developer tools is a bit counter-productive.

I looked at his benchmark post last year to see if I could reproduce his Atom numbers using the same test files (I'm a dev on the Atom team). I could not and asked what version of Atom he was using. I got no response.

He links to a benchmarking repo with some test files and results very similar to his. That repo uses Atom 1.9.6, which is 18 months old and not representative of current Atom performance. Every release since has included performance work, including rewriting some of the core parts in C++, and both memory usage and performance are far better than what he posts.

I posted a comment with my much better performance numbers (from my laptop to be fair) and a suggestion that he retry Atom. His response was to mark all comments on his benchmarking post as available to medium members only.

Edit: Here are some articles on our blog since then about performance improvements:

- Editor performance improvements
- Performance improvements
- Our text buffer rewrite in C++
- Improved responsiveness and memory usage
- how we rewrote the renderer for perf
- Faster start-up time with snapshots, removal of jQuery
- Improving startup time
- large file performance
- links to our benchmarking system for tracking perf
