top | item 45361268

Show HN: Dayflow – A git log for your day

480 points| jerryliu12 | 5 months ago |github.com

Hi HN! I've been building Dayflow, a macOS app that automatically tracks what you're actually working on (not just which apps you have open).

Here's what it does:

- It creates a semantic timeline of your day;

- It does it by understanding the content on your screen (with local or cloud VLMs);

- This allows you to see exactly where your time went without any manual logging.

Traditional time trackers tell you "3 hours in Chrome" which is not very helpful. Dayflow actually understands if you're reading documentation, debugging code, or scrolling HN. Instead of "Chrome: 3 hours", you get "Reviewed PR comments: 45min", "Read HN thread about Rust: 20min", "Debugged auth flow: 1.5hr".

I was an early Rewind user but rarely used the retrieval feature. I built Dayflow because I saw other interesting uses for screen data. I find that it helps me stay on track while working - I check it every few hours and make sure I’m spending my time the way I intended - if I’m not, I try to course correct.

Here’s what you need to know about privacy:

- Run 100% locally using qwen2.5-vl-3b (~4GB model)

- No cloud uploads, no account

- Full source available under MIT license (https://github.com/JerryZLiu/Dayflow)

- Optional: BYO Gemini API key for better quality (stored in Keychain, with free-tier workaround to prevent training on your data)

The tech stack is pretty simple, SwiftUI with a local sqlite DB. Uses native macOS apis for efficient screen captures. Since most people who run LLMs locally already have their tool of choice (Ollama, LLMStudio, etc.), I decided to not embed an LLM into Dayflow.

By far the biggest challenge was adapting from SOTA vision models like Gemini 2.5 Pro to small, local models. My constraints were that it had to take up <4GB of ram and have vision capabilities. I had to do a lot of evals to figure out that Qwen2.5VL-3B was the best balance of size and quality, but there was still a sizable tradeoff in quality that I had to accept. I also got creative with sampling rates and prompt chunking to deal with the 100x smaller context window. Processing a 15 minute segment takes ~32 local LLM calls vs 2 Gemini calls!

Here’s what I’m working on next:

Distillation: Using Gemini's high-quality outputs as training data to teach a local model the patterns it needs, hopefully closing the quality gap.

Custom dashboards where you can track answers to any question like "How long did I spend on HN?" or "Hours until my first deep work session of the day

I'd love to hear your thoughts, especially if you've struggled with productivity tracking or have ideas for what you'd want from a tool like this.

130 comments

order

andrewmutz|5 months ago

You should sell this to Lawyers and other professionals who bill per hour to reconstruct their billables for the day without missing anything. They would pay big money for something that recovered forgotten(unbilled) work throughout the day.

1zael|5 months ago

Devil's advocate opinion - this would also show how little per hour lawyers spend working :)

whalesalad|5 months ago

I’m a software contractor and I’ve wanted this forever. I’m prototyping something on Linux now.

MollyRealized|5 months ago

I'm a litigation legal admin - I have been for 25-30 years. I instantly brought this up to an associate, telling them, "Maybe not now, but before you retire, this'll be the norm in the industry."

She had been complaining the day before about having to reconstruct a huge bunch of little 0.1 entries involving e-mails to various individuals in cases. If it could be done automatically, through a local LLM? chef's kiss

Trust me, law is definitely where you want to land this thing.

In all honesty, I have absolutely no negotiating power or decision-making authority for my firm, but it's a big one -- if that's a direction you want to go, can't guaranty I can swing enough weight, but I probably could find you the right people to talk to, give you an introduction.

mellosouls|5 months ago

Per hour? In the UK they bill by the 6-minute! If ever anything told you something about a profession...

laurieg|5 months ago

Really nice! I currently use ActivityWatch for tracking tasks on PC.

Some things I would like to be able to do with software like this:

- Identify the 'spark' of a distraction. For example, opening my email inbox to read a specific email also shows me many unrelated emails. These can easily be the cause of a 5-15 minute distraction. This information is often actionable. I installed browser plugins to hide my youtube suggested videos and my distractions went down. I made sure to close all unused windows to avoid catching a glimpse of unrelated work.

- Identify repeated tasks, and the cadence of those tasks. Do I manually make an invoice once a week for a particular edge case? Is the process basically identical every time. Could this be automated?

- How was I feeling before, during and after a task. (This is a very broad and intentionally not well-defined question, but I think it has the most promise for improving procrastination and task initiation).

jerryliu12|5 months ago

Yep, helping people understand their distraction patterns would be an amazing feature. I find myself doing the same thing, funnily enough I also have that same Youtube extension.

rw2|5 months ago

I love the product concept but the fact this person has an almost empty github and suddenly launches an app that can easily be spyware concerns me a lot :). A lot of security concerns with password etc.

astafrig|5 months ago

Could easily dismiss those concerns by looking at the source code instead of snooping around their profile, especially if you’re on GitHub anyway :)

yewenjie|5 months ago

I would not be comfortable sending my bank info passwords and all sorts of other sensitive data that I input and see on my screen to Gemini. How much is the qualitative performance difference with a local model?

jerryliu12|5 months ago

If I had to put a grade on my own experience and evals, Gemini 2.5 pro produces A- results and qwen2.5vl is maybe like B-/C+. Obviously everything's nondetermistic, so it's hard to guarantee a level of quality.

I'm reading through papers that suggest it should be possible to get SOTA performance on local models via distillation, and that's what I'll experiment with next.

muzani|5 months ago

Google owns my email, browser, phone operating system, and a small amount of passwords. I assume that it has already stolen all my confidential data by now.

CIPHERSTONE|5 months ago

Also, if your not using an enterprise edition of gemini where your data is not used for model training, your sensitive data prompts and responses is 100% available to google.

nemo1618|5 months ago

Your passwords should never be visible on screen anyway: They go straight from a password manager into a censored input field.

LocalPCGuy|5 months ago

I haven't seen this mentioned, but I immediately thought this could be a great tool for folks with ADHD. The potential for seeing what kinds of things regularly trigger distraction (I know, everything, right?) and any patterns that exist (i.e. every time I make a git commit, I go check Hacker News and lose 15 minutes). As well as being able to review day that was captured automatically is huge. The best success I had with tracking what I did was when I used to use TimeRescue to ensure I had accurate record of hours for clients, but every attempt to use anything that requires manual entry fails very quickly (either too distracting everytime I use it, or I literally just forget to use it).

Going a step further, "real time" (given processing delay) to help stay on task when the focus has shifted to something unrelated (maybe allow the individual to define this or say yes/no to train the prompts as it goes).

Anyways, it looks great. I also liked the _idea_ of Windows Recall, so to see something like this that can be privacy first is really nice.

olex|5 months ago

Pretty nice. How does it handle multiple displays? I've set it up with local Ollama, and it seems to only record and analyze one of my two secondary displays. It would be ideal if I can select which one is used if the recording is limited to a single display, or even better if it can record and analyze the entire multi-monitor desktop surface.

edit Nvm, it seems it always records the display that is currently in focus. That is probably the better way to handle it, since it automatically solves the "ignore what's shown but not interacted with on secondary displays" problem.

LocalPCGuy|5 months ago

This was my question also, I think "even better if it can record and analyze the entire multi-monitor desktop surface" would be the best option. I don't know what the impact of that would be on both recording size and AI processing time, but just because one monitor is focused doesn't always mean what's happening on another is ignored. Some examples: an ongoing meeting or watching a video on one screen while taking notes on another; or coding on one screen and a browser/app auto-refreshing on another.

jerryliu12|5 months ago

Yep, you figured out how it works! That was the easiest solution I could come up with. I'm sure theres additional context on other screens but this was a good 90/10 solution.

thalesac|5 months ago

I see a potential issue where you're in a zoom call in one monitor and working in something else in the other (multitasking ) how to handle that ?

jappwilson|5 months ago

Similar Idea to screenpipe. That gives you more customization:

github.com/mediar-ai/screenpipe

louis030195|5 months ago

screenpipe founder here, love to see more products in this area, ideally OSS, local, no lock in, API/MCP friendly

kind of sad it's macos only, i'm mostly windows user now :)

r0bbie|5 months ago

I'd only ever consider doing it with a local model, but this looks really cool!

jerryliu12|5 months ago

Thanks! Between my friends and I, it's about a 50/50 split between local and cloud. I think it's great to be able to pick the tradeoff between quality/privacy based on your own privacy preferences.

zeroq|5 months ago

On one hand I'm super enthusiastic about your project.

This could help battle procrastination, organize your time in a long run, bill your clients more efficiently, etc. 20 years younger, hyper productive me would kill for such product.

But then I recall when I accidently suggested TimeRescue to my boss at one time, and suddenly he was skimming though everyones daily logs to see if they're spending 100% of their times in business facing apps.

When I first heard about "covid mouse mover devices" that faked activity for remote workers I thought it was a joke. Seriously.

But I'm afraid this is the dystopian future. Employers constantly looking at your screen and getting spreadsheets with your daily effort.

Overall, very disturbing product.

defgeneric|5 months ago

This was my first thought too. The last generation of activity tracking, while still dystopian, was a little different at least in that it was mainly statistical. So action-wise, it might point managers at "potential problems," but doesn't make its way into a performance review (e.g. "your mouse only traveled 81.72 screen-miles this quarter, 2 standard deviations below the mean, while you also scored the lowest on number of keystrokes with vscode as the active window..."). If a manager really wanted to summarize exactly what was done they had to spend an almost equal amount of time to watch. To some degree, this alleviates that.

jerryliu12|5 months ago

Yea, honestly I would hate if people used this to track _other_ people, especially bosses. I wanted to build something that gave people more agency to do more with their precious time, but there definitely is a fine line here.

tmychow|5 months ago

Woah this is fab; much less cognitive load than manually using a time tracker. And I'm glad that there's a local option and a "BYO key" option for privacy!

Feel like something of this shape should have existed for a while, but this is very well executed!

requilence|5 months ago

Great project! I’ve had a similar experience with Rewind and the related privacy concerns. A quick thought: if I recall correctly, Rewind performs OCR locally, so it only needs to send textual data. Since you’re focusing on macOS, you could rely on VNRecognizeTextRequest and skip the extra OCR complexity. It might also help to detect and mask sensitive information with lightweight models (e.g., BERT), especially when leveraging cloud-based AI.

jerryliu12|5 months ago

Woah didn't know about VNRecognizeTextRequest, that's super cool thanks for flagging!

ahoog42|5 months ago

if you are on a Zoom/video call, does anyone know if you would have to declare that your "recording" it? I'm thinking more from the legal perspective of wiretapping/consent laws. If you have live transcripts/subtitles does that change any legal requirement.

hx8|5 months ago

Yes, in my state it's generally illegal to take a screenshot of a zoom call without announcing you are recording the conversation. I'm not certain, but I think the issue is the storage of the 1fps video, not the AI summary.

pastapliiats|5 months ago

Who doesn't love Windows Recall

7bit|5 months ago

Im confused how much loves this finds here while Recall was rightfully critisised. ITS the Same Pictures

boomlify|5 months ago

Boomlify is a modern temporary email service that provides long-lasting disposable emails with multiple domains, smart inbox preview, and a clean centralized dashboard to manage everything in one place. It lets users instantly generate temp emails without registration, choose custom addresses, and even extend email lifespans beyond the usual short limits. Boomlify also supports Gmail-based temp emails, guest and registered user modes, and cross-device access with a privacy-first design. For developers, Boomlify offers a powerful Temp Mail API with custom domain support, smart caching, rate limits, and detailed docs with examples in multiple languages, making it easy to integrate disposable emails into apps, websites, or automation workflows.

tiernano|5 months ago

wait... isnt this pretty much what Microsoft was doing with Recall?

jerryliu12|5 months ago

Recall (and Rewind) are similar in the sense that they both use screen data, but it's designed for retrieving specific things you saw, not semantically summarizing your time. My opinion is that they're completely different feature sets.

lucfranken|5 months ago

I am currently testing the app. Maybe, for more engagement, start processing cards faster just after installation. It feels weird to have to wait 30 minutes, just show me at least a card. Like the fact that I am installing DayFlow would be a positive experience.

Compliments for the Wizard - that one works perfect at least with Gemini. One little detail: You have a Github Star button in it, that really was at a non-logical place and made me think.

jauntywundrkind|5 months ago

It's somewhat related two other recent submissions,

Replace PostgreSQL with Git for your next project for git data storing. https://news.ycombinator.com/item?id=4535144 https://devcenter.upsun.com/posts/why-you-should-replace-pos...

Consumer.today day-logging single user microsite. https://consumed.today/ https://news.ycombinator.com/item?id=45351446

Cute serendipity, rule of three. Neat project too; conceptually it sounds like an amazing ability to be able to better watch ourselves. Doing it via screenshots & AI feels like a fun sense-making adventure that actually makes a lot of sense, that can maybe try to pick through & discern what the screen is doing in a lot of different scenarios.

mrklol|5 months ago

"Records screen at 1 FPS in 15-second chunks.“

If it’s recording 15 seconds, how often are you doing that? Once every 15m as the analysis interval is 15m?

pi-err|5 months ago

Looks like it's recording all the time and analyzes 900 screenshots every 15 minutes? And it keeps records for 3 days.

So I'm not sure I buy the lightweight/low-impact claim.

novoreorx|5 months ago

Given that ScreenMemory [1], an app that automatically records screenshots, has already saved my screen at intervals, it would be great for it to also have this AI summarizing ability.

[1]: https://screenmemory.app/

ghm2199|5 months ago

I would imagine this could be one of the inputs along with a STT system as context to an LLM. Because in general we can speak faster than we can write/type and for me, specifically, after a point in the day typing creates a higher cognitive load than speaking.

1. "Create a reminder for reading this email at 5:00 pm" and this could infer what to do from the screen shot's description(plus a local MCP tool for calendar)

2. "Can you fetch that file form that project in that workspace and implement the pattern in the code on my vscode terminal?" It can lower cognitive fatigue of typing and clicking a bunch of place.

3. Take notes as I describe something on the screen. It could be for prompt composition e.g. get the link from my browser and the file on vscode and write code that does XYZ.

anyg|5 months ago

Couldn't we get a low-res version of this info by tracking the active window using a cli tool? For linux, there are several options. Not sure about Mac.

Another approach is to run OCR on 1FPS screenshots. Everything runs locally without draining the battery like an LLM would.

jerryliu12|5 months ago

You definitely could! I think it would just be harder to get good semantic understanding of what you did during a segment of time without LLMs.

rcarmo|5 months ago

This is amazing and yet I think it would need an existential angst mode to capture those days when I am doing video calls with various teams all day.

Maybe patching https://github.com/JerryZLiu/Dayflow/blob/main/Dayflow/Dayfl... to say "Describe what you seen in this computer screen in the style of Werner Herzog" would do it...

lucfranken|5 months ago

Really cool!

As already seen in the comments there are lots of desires to add more data compared to just screen input.

Could be things like:

- Apple HealthKit / watch - custom apps - Phone logs

Also you stated, and true, that there is much focus needed on improving your core feature.

It might be interesting to allow some kind of API / plugin area. So that people can expand on your core feature and add the desired parts. Might in the future expand to some kind of AppStore like feature with plugins.

That would keep your work focused and allows others to make it complete in their vision, and for others.

p_zuckerman|5 months ago

This would be helpful also for companies? Hence, ethically point of view it would violate the employee time in screen so there would arise issues with employees rights and HR?

ttoinou|5 months ago

I've been using today and it seems like it's using 1 euro of credit per hour, is this normal ? Seems a bit expensive. I'm not running the trial of Gemini anymore. Would be nice to detect when there's no mouse movements / keyboard activity and stop recording in those case. And also stop recording when a media player is at fullscreen.

philipallstar|5 months ago

Looks like 98% of my day is Hacker News. This thing must have a bug.

Right?

mustaphah|5 months ago

Nicely done.

Funny enough, I had a similar idea a few weeks back; I jotted it down in my idea sketchpad. It felt a bit ambitious for an open-source side project, and I wasn't sure if it could even work with a local LLM. I was genuinely excited about it, nonetheless.

Now that I know it's totally viable, I've got even more reasons to build a Linux version myself.

Klaster_1|5 months ago

Wow this is awesome! Wish I could try this on Windows. This is genuinely one of few time tracking solutions that piqued my interest. For now, I'll stick to manual labeling activities with my custom, simple tool: https://github.com/Klaster1/timer-5

aiven|5 months ago

my custom simple tool is a stopwatch on a phone ¯\_(ツ)_/¯

rokob|5 months ago

I think this is pretty cool but I spend most of my day on a laptop I don’t own and there’s no way I could get this on there.

sipjca|5 months ago

This is super rad. Love it being Open Source, and with the option to choose local models. You’re awesome, thanks!

atoav|5 months ago

Curious how this works with multi monitor setups, e.g. watching a viedeo while researching travel plans.

xp84|5 months ago

Very cool idea. When using the Gemini option, what kind of cost would be expected to be incurred? I'd be satisfied by knowing the approximate number of tokens one would expect to be consumed by processing an hour of these recordings, and which specific model is being used.

jerryliu12|5 months ago

Gemini 2.5 Pro is pretty expensive, mostly because videos take up a lot of tokens. It's roughly 1 million input tokens/hr, with a relatively insignificant amount of output tokens. Fortunately, Gemini has a very generous free tier, which is more than enough to cover daily usage. If you set up one paid project (and just don't consume any tokens), you can still use the free tier on a different project, and they can't train on your data.

graeme|5 months ago

Would be helpful to have a screenshot test tool alongside the api test tool. The app didn't create an application support folder yet. Possibly not enough time has passed but would be great to be able to troubleshoot sooner.

jerryliu12|5 months ago

Thanks, yeah I do need to flesh out the debugging options. In the menu bar you can click the Dayflow icon which should allow you to view the recordings folder. The sqlite db is in that folder too if you want to poke around there as well.

fsto|5 months ago

Love the idea, how you present it here and in the product. Clear, trustworthy and calming. Just installed and looking forward to try out. Half-random question: how has your way to the current UI / UX (visually and copy)?

pgcosta|5 months ago

I though about having something like this! This can be a great tool! For engineers this can be a great tool to summarize the standup update, or even to recall what did we do yesterday I'll check it out now

sawyna|5 months ago

I have installed this and configured the API key, it's been three hours nothing is happening for some reason. The app doesn't show anything. Is it because I have a multi monitor setup?

mellosouls|5 months ago

Congrats on a nice looking app that will be very useful for individuals (though potentially misused by toxic managers).

Kudos particularly for the efforts you've gone to on explaining privacy implications.

jerryliu12|5 months ago

Thanks! Wanted to build something I'd personally be comfortable using.

danielfalbo|5 months ago

It would be useful for freelancers if Dayflow automatically detected the client they were working for, to count hours spent on client, similar to what toggl.com does but automated

netnameus|5 months ago

Testing it today, because my Mac thinks I'm sharing my screen, it's suppressing notifications on phone, mac, etc. Need to figure out how to fix that.

voidUpdate|5 months ago

What makes this similar to "git log", other than it show events happening in an order? It looks more like my calendar layout than git log to me

blef|5 months ago

The onboarding flow is neat. I will give it a try over the next days. The local setup makes the computer heat a bit every 15 minutes, but it's ok

user3939382|5 months ago

It’s nice to forget everything you’re working on periodically and examine the pieces of what you’ve built and redecide what they mean if anything.

smcleod|5 months ago

Nice work, does this work with local (100% offline) models assuming you have decent hardware and are serving them up with llama.cpp or similar?

jerryliu12|5 months ago

Yep! Have tested it out on Qwen 2.5VL 3B and it works reasonably well on my 16GB Macbook Air. The only thing I will say is that I don't think it's a great idea to run local models on laptop battery, since it's quite compute intensive and drains kinda quickly. Have tested with Ollama and LMStudio, but you should be able to use any OpenAI compatible local server.

chewhongjun96|5 months ago

Is it possible to include wearables as a data sources?

i.e. apple watch for sleep, running, activity levels? it could really give a 360 view of your life

jerryliu12|5 months ago

That would be really cool, but for the foreseeable future there's still a lot of room to improve how screen data is used so I'll mostly be focused on that.

wayeq|5 months ago

I'm sure my employer would be thrilled with a background process taking screenshots every second

rememberlenny|5 months ago

Congrats. This is very well executed.

muggermuch|5 months ago

This is amazing - just the tool I needed; thank you so much!

zeeqeen|5 months ago

Great! This is what I want for long! And UI is good too.

ttoinou|5 months ago

Wow such a great idea. Will you monetize this ?

akhilnchauhan|5 months ago

this is very cool - thanks for sharing!

VadimPR|5 months ago

I need this so badly - but on Linux :)

tonyhart7|5 months ago

the fact that people already built this with open source multi modal model is astonishing

ctrlp|5 months ago

Very nice. Beautiful UX.

rasulkireev|5 months ago

this is amazing! thanks for doing this!

j1000|5 months ago

What kind of problem is this solving? Why would I install spyware on my machine? lmao

scuff3d|5 months ago

Am I the only person that sees "AI" and "screen capture" and thinks no fucking way? I switched to Linux specifically to get away from data collection, why on earth would anyone want to opt into it?

graeme|5 months ago

The app has a local only mode. That's just your computer chip/gpu running a language model locally. It works even if all outgoing connections are blocked, as far as I can tell. What's the threat you're worried about in that scenario?

syngrog66|5 months ago

wow

just wow

2025 is getting surreal online