Show HN: I built a tech news aggregator that works the way my brain does

(deadstack.net)

177 points | by dreadsword 1 day ago

40 comments

taftster 1 day ago
I'm not sure, something about the "Recent Stories Summary" section (first view) is hard to read. The spacing is wrong. And the blue font. Someone mentioned Garamond too.
It's creating a "wall of text" effect to me and I'm not able to quickly skim and allow my eye to catch the bits that are interesting to me.
As a comparison, the HN homepage is very accessible to me for skimming and finding things to click into (like this entry).
UI is often quite subjective, understood. But I can't really "scan" the first view fast enough. It's all blending together and causes extra processing on my mind.
[-]
- dreadsword 1 day ago
  I hear you - there's something to be done there. My initial thought was to stay as close to convention as I could (links are blue!), but as the RECENT list gets long, it definitely gets less scannable.
  Thank you for the feedback!
  [-]
  - taftster 1 day ago
    I hear you about "links are blue" ... except when you are a link aggregator.
    The "links are blue" design from early HTML was meant to highlight links in the context of a paragraph of prose, not a list of link items. "Blue" means something special about the text in the context of the text around it.
    In this case, the blue font is distracting because the links are the content. You don't need the blue to help your links "stand out". Because the links are normal text, using a normal palette would be appropriate.
    I don't mind some subtle clues that these are links. Underlines, slight grey text. Or even a subtle hover effect. Two cents.
  - gneuron 6 hours ago
    Put the number of occurrences (I assume that's the # at the end) first to help with signal to noise ratio based on quantity of coverage maybe? I'd also take a look at news minimalist if I were you, and how they used significance scoring as a fill in vs upvotes to provide additional signal: https://www.newsminimalist.com/
    It's quite scannable, but obviously you're doing reverse chrono order so up to you how best to solve the UI issue.
- dreadsword 1 day ago
  How about now? Story titles are still clickable links, but are black. Made the story count a link as well, and kept it blue as a visual cue.
  [-]
  - felideon 1 day ago
    Some additional feedback:
    There's no reason for both the story count and the story summary to be clickable. It's confusing because:
    (a) It's not clear what the number in parentheses even means (until you click and infer)
    (b) Separate links makes you think they lead to different pages
    Also, echoing another comment, it's not really clear what "incoming" and "outgoing" stories mean. Maybe "new" vs. "stale"?
  - taftster 1 day ago
    Better, to be honest. Keep refining of course. But this is definitely more readable.
    I admit that straight black is not quite the right answer either. A slightly toned down dark grey would be nice. And again, subjectively, I like how HN has a row of non-link smaller (lighter shaded) text under each listing, which I think plays nice for the white space between each item.
    [-]
    - bcrl 1 day ago
      Personally, I far prefer black over grey. Grey is really hard to read across a variety of lighting conditions and devices. The older you get, the more important contrast becomes.
      [-]
      - taftster 1 day ago
        Fair and good point. Sharp black bothers me, so just adding in a little hue is nice for my eyes. But that's me, of course.
        Noting also that this text is #000000 black, per the CSS. Maybe the background color helps soften it a little? Like contrast white/black is hard on the eyes, but HN is not?
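For anyone who wants to put numbers on the black-vs-grey question above, here is a minimal sketch of the WCAG 2.x contrast-ratio formula. The specific colors in the demo loop (the off-white `#f6f6ef` and the greys) are illustrative assumptions, not values taken from either site's stylesheet.

```python
def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color written as '#rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg: str, bg: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    lum_a, lum_b = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Illustrative text/background pairs (assumed values).
    for fg in ("#000000", "#333333", "#828282"):
        for bg in ("#ffffff", "#f6f6ef"):
            print(fg, "on", bg, round(contrast_ratio(fg, bg), 2))
```

An off-white background lowers the ratio slightly for the same text color, which may be part of why pure black text feels softer on it than on pure white.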
nathanwallace 1 day ago
A similar site I've enjoyed for >15 years (!!) is https://techmeme.com/
I also use its sister aggregator site for political news every day - https://www.memeorandum.com/
[-]
- dreadsword 1 day ago
  Yes - Techmeme is definitely the archetype and a great product, and I have spent lots of time there over the years as well!
MiiMe19 1 day ago
This is exactly the kind of thing I think LLMs are best at. We have created the world's coolest string manipulator, and this sort of task suits it perfectly. Awesome job!
[-]
- Diti 12 hours ago
  How do you ensure the titles aren’t confabulated? I’ve used Kagi News recently and it summed up the articles about France wrong (that’s the only section in which I could reliably spot the made-up stuff).
- dreadsword 1 day ago
  Cheers and thanks for the kind words! And yes - LLMs (at least o3-mini) do a great job as my editorial team - the site is 100% automated.
lateforwork 1 day ago
Love it, but the body font (Garamond) is not easy on the eyes. Garamond is one of my favorite fonts in print and at not-too-small sizes. On the screen it doesn't look good because where the characters get thin, they get too thin (or as font experts call it, too much contrast).
[-]
- dreadsword 1 day ago
  Noted, thank you! I haven't put a tonne into readability, other than some basics - I prefer a serif'd font, and I made sure the background was easier on the eyes than #FFFFFF haha
jantissler 10 hours ago
In case you are still reading this: Any plans to add RSS etc.? I might be in a small minority, but for me my feedreader is the central source. If I can't subscribe via feed, it doesn't exist for me. That's the way I'm following Techmeme and also Hacker News (with the minimum points set to 80 to show up in the feed). It's kind of annoying how many sites even in the tech field don't offer an RSS feed anymore.
[-]
- dreadsword 9 hours ago
  Still reading! Yes - RSS is on the roadmap - a /recent feed, and feeds for tags. Cheers!
vivzkestrel 20 hours ago
Amazing! Lots of questions, if you don't mind answering: 1) Is this written in Python? 2) If yes, does it use feedparser? 3) How are you storing these feeds in the database? 4) How are you handling CDATA or HTML-based feeds that return lots of HTML: do you sanitize before storing, or store directly in the database as a CDATA string? 5) How do you handle edge cases and anomalies across different feed providers?
[-]
- dreadsword 19 hours ago
  Send me more questions, and I'll send you more answers!
- dreadsword 19 hours ago
  Dude, it's old school LAMP stack all day. I use SimplePie to handle & sanitize feeds, storing text only (stripped of HTML). Edge cases are pretty smoothed out by SimplePie!
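To illustrate the "store text only" idea for readers who asked about Python: the sketch below is not SimplePie (which is a PHP library) and not the site's actual code. It is a rough stdlib-only approximation of stripping HTML from a feed item before storing it. The function name `feed_html_to_text` and the list of block-level tags are assumptions made for the example.

```python
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent paragraphs do not run together.
BLOCK_TAGS = {"p", "br", "div", "li", "tr", "h1", "h2", "h3", "h4", "blockquote"}


class _TextExtractor(HTMLParser):
    """Collects visible text and ignores the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._chunks.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Collapse all runs of whitespace into single spaces.
        return " ".join("".join(self._chunks).split())


def feed_html_to_text(fragment: str) -> str:
    """Return the plain text of an HTML fragment taken from a feed item."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()
    return parser.text()
```

Storing only the stripped text means nothing from the feed is ever rendered as markup later, at the cost of losing links and formatting inside the summary.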
swader999 1 day ago
Maybe I'm the only one, but I would love a feed that never again showed me items that I've already scrolled past without engaging with the first time.
[-]
- facundo_olano 1 day ago
  I built that feature (auto mark as read by scrolling) into my feed reader if you’re up to self host and curate your sources
  https://github.com/facundoolano/feedi
  I did try to build a public facing news aggregator with a similar ux but I couldn’t pull it off purely based on client side state (and I didn’t want to do user management)
  [-]
  - dreadsword 21 hours ago
    I'm with you on not wanting to do user management, feature creep, user data security concerns, etc.
- stevage 1 day ago
  Me too. That's the number one thing I always wish for with every feed. And if I reach the end of the feed, that's fine.
  [-]
  - taftster 1 day ago
    Right, I'm so tired of "infinite scroll". There's a mental/emotional reward for actually reaching The End of something.
    [-]
    - dasil003 1 day ago
      I wonder how much this is a factor in the widespread mental health malaise that is often attributed to tech these days? Certainly plenty of factors to go around, but consider the connotation of "scrolling" and how common it is as a default replacement for boredom in modern life, and suddenly it seems quite insidious.
      [-]
      - taftster 7 hours ago
        Super insightful. I feel the same way. I can't mentally "conclude" my read for the day, because there is always just One More Article that is just under the threshold.
        An extension of Fear of Missing Out, basically. And yes, I think it causes mental exhaustion and might be directly related to some mental disorders that we have really yet to understand.
    - dreadsword 21 hours ago
      Yes - I was actually deliberate about leaving infinite scroll out; I started w/ scrolling on tag pages, for example, but switched them to paginated - largely for the feeling you described.
NetOpWibby 1 day ago
I love that your site comes with an overview instead of clicking away to another site immediately. Feels snappy and looks good. I can see this being my news roundup. Great work!
[-]
- dreadsword 1 day ago
  Cheers and thank-you!
fuddle 1 day ago
Cool site, an About page would be useful. It's hard to tell how the site works.
[-]
- dreadsword 1 day ago
  Fair enough - it's honestly not something I expected anyone to be interested in enough that an about page would be required.
  At a high level, it reads RSS feeds from a number of sources, and uses LLMs to identify clusters of stories about the same thing, group them, tag them, and designate them a "top" story or not. That's it.
  The biggest thing I've learned in all of this is that o3-mini is far and away the best at following instructions (for this use case). Periodically I'll cycle through the models available on Groq, and always come back to o3-mini.
  [-]
  - gitaarik 10 hours ago
    Very nice, I've been working on something similar, but for regular news. But I want to summarize complete articles, and RSS only provides the headlines and sometimes the first paragraph of an article.
    So I decided to write web crawlers, but then you run into CAPTCHA stuff. So I instead used Selenium to automate my browser to fetch the news articles. That worked well, but I haven't worked on it since.
    Now I'm thinking that with all these AI browsers around these days, maybe that's actually easier than doing it with Selenium. But haven't researched it properly yet.
    In any case, the LLM work of detecting whether two articles are reporting the same news, and summarizing the story, is the same in your project. So in case your project is open source, I would be interested in that part.
amatecha 1 day ago
This is probably a dumb question, but.. what do "incoming" and "outgoing" mean?
[-]
- dreadsword 1 day ago
  Oh man, don't ask - not a dumb question at all. I'll reshare what I put in another comment that answers it, but bottom line is they're a design gap in the context of /recent.
  You're right --- incoming & outgoing end up being redundant on the "Recent" view. Where they're (more) relevant is in the "Top" view where the LLM editor has picked a subset of stories to be categorized as top and incoming/outgoing are the ones that didn't make the cut, organized by timeliness.
  Definitely a gap in design!
  [-]
  - amatecha 1 day ago
    Oh, sure, but I literally just don't understand what their meaning is >_>
  - thekevan 1 day ago
    I assumed it meant stories that trended highly and were now fading in popularity (outgoing) and stories that are only starting to trend, but quickly, and may be on a fast ascent (incoming).
    Sort of a combo of "in case you missed it" and "the next new big stories".
Jaauthor 9 hours ago
Thumbs up! This low-friction aggregator beats Ground News and other aggregators by a mile for simplicity and ease of reading. Nice work!
[-]
- dreadsword 4 hours ago
  Awesome--- thanks so much for the kind words!
I_Nidhi 12 hours ago
Interesting idea. Is there a context for how these stories are being picked up, or is it just to compile all the recent stories in one place?
al_borland 1 day ago
I like it. It kind of reminds me of the old Fever RSS reader, which would group together similar articles from different sources, and use that to rank how hot a story was.
[-]
- dreadsword 1 day ago
  Not familiar with fever, but there is something similar buried at the heart of mine - the LLM clusters stories, and they get promoted to public when they reach a threshold of unique sources.
  That threshold is a function of day of the week - on weekends when the news cycle is quiet, it lowers the bar --- Tuesday to Thursday it's at its most restrictive.
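A day-of-week threshold like the one described could be sketched roughly as follows. The numbers in `MIN_SOURCES_BY_WEEKDAY` and both function names are made up for illustration; the thread does not give the real values or how source names are normalized.

```python
from datetime import date

# Hypothetical minimum number of unique sources per weekday,
# indexed Monday=0 ... Sunday=6: strictest Tuesday to Thursday, lowest on weekends.
MIN_SOURCES_BY_WEEKDAY = (3, 4, 4, 4, 3, 2, 2)


def min_unique_sources(day: date) -> int:
    """Threshold a cluster must reach on the given day."""
    return MIN_SOURCES_BY_WEEKDAY[day.weekday()]


def is_publishable(cluster_sources, day: date) -> bool:
    """True if the cluster has enough distinct sources to go public on `day`."""
    # Normalizing case and whitespace is an assumption about how duplicates are detected.
    unique = {name.strip().lower() for name in cluster_sources}
    return len(unique) >= min_unique_sources(day)
```

With these assumed numbers, a cluster covered by two distinct sources would go public on a Saturday but would have to wait for more coverage on a Wednesday.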
ying_zh 21 hours ago
Nice work! I built a simplified version for my daily routine of searching and reading industry and research papers: https://multimodal-scout.app/. My main content sources are HN and Hugging Face trending papers, which already serve as a content filter for me instead of scraping random news. It works well for the domain I’m interested in. I’ve also made it open source, so you can tweak the RSS feed and adapt it to your needs. https://github.com/yingzha/multimodal-scout/
jhack 1 day ago
I'm REALLY liking this, way more than I thought I would. Great job! What's your stack if you don't mind my asking?
[-]
- dreadsword 1 day ago
  Awesome - glad you're enjoying it and thank you for the kind words!
  My "Stack" ---- LAMP + o3-mini for editorial tasks + Bootstrap for responsive front end. That is to say: It's old school, and painfully functional. But, light & fast.
  [-]
  - jflskajfsd 1 day ago
    [dead]
HardwareLust 10 hours ago
So the question really is: what is your "signal-to-noise test"?
_menelaus 1 day ago
This is pretty cool man. How do you cluster the articles into stories? It looks like you did a good job of it.
[-]
- dreadsword 1 day ago
  Thanks so much for the kind words - it's 100% o3-mini for clustering. I have zero editorial input as to what constitutes a cluster, what's "top" news, etc.
  The one subtlety is setting up the LLM to understand whether a new story belongs in an existing cluster, or, with > 1 neighbors, constitutes a new cluster. The challenge there is scoping the clustering window (hours of stories for consideration) and topic breadth to avoid creating Katamari-super-clusters that just end up with every story associated to them.
  At this point I seem to have found a sweet spot re: the hours window, the frequency of processing, and the design of the prompt such that it's working consistently.
  Very few false positives in terms of spurious clusters being created, or potential clusters being missed.
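The windowing idea above could look something like the sketch below. Everything here is an assumption for illustration: the `Story` fields, the 12-hour default, and the prompt wording are invented, and the actual model call is deliberately left out because the thread does not describe it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Story:
    story_id: int
    title: str
    published: datetime
    cluster_id: Optional[int] = None  # None means not yet clustered


def stories_in_window(stories, now: datetime, window_hours: int):
    """Keep only stories published within the last `window_hours` hours."""
    cutoff = now - timedelta(hours=window_hours)
    return [s for s in stories if cutoff <= s.published <= now]


def build_clustering_prompt(stories, now: datetime, window_hours: int = 12) -> str:
    """Assemble the text an LLM would see: recent clusters plus unassigned stories."""
    recent = stories_in_window(stories, now, window_hours)
    clusters = {}
    unassigned = []
    for s in recent:
        if s.cluster_id is None:
            unassigned.append(s)
        else:
            clusters.setdefault(s.cluster_id, []).append(s.title)

    lines = ["Existing clusters:"]
    for cid, titles in sorted(clusters.items()):
        lines.append(f"  cluster {cid}: " + " | ".join(titles))
    lines.append("Unassigned stories:")
    for s in unassigned:
        lines.append(f"  story {s.story_id}: {s.title}")
    lines.append("For each unassigned story, answer with an existing cluster id, NEW, or NONE.")
    return "\n".join(lines)
```

Keeping the window short limits how many old clusters a new story can attach to, which is one plausible way to avoid the super-cluster effect mentioned above.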
dotneter 19 hours ago
I am trying to build something similar but for the tech articles https://fooqux.com/
tiaremnt 1 day ago
I just configured my own RSS website, only to find this awesome solution. I'm crying right now; if only I had found you earlier, it would have saved me so much time. Also, do you have the code publicly available so that I can customize it for my own needs?
embit 1 day ago
I have done something in a similar style for tech news, aggregating based on tags. That way I can read tech news on micro topics. https://embit.ca/ Your feedback is appreciated.
[-]
- dreadsword 1 day ago
  Looking good - keep at it! Use it yourself every day, and iterate it continuously.
stevage 1 day ago
I would love this but with more blogs and less product announcements.
[-]
- dreadsword 1 day ago
  I hear you - today was heavier on product announcements than normal I feel. And re: blogs - for sure: send me suggestions and I'll add them...!
  [-]
  - stevage 23 hours ago
    https://github.com/surprisetalk/blogs.hn/blob/main/blogs.jso...
    [-]
    - dreadsword 3 hours ago
      Cheers and thanks for sharing - that's a big list!
sweenzor 1 day ago
Very cool. Having an immutable record "time machine" you can use to re-find something you remember reading is very humane. I'd love to see this for world news, politics, etc.
[-]
- dreadsword 1 day ago
  Ah cool! it is built to be extensible, and I'll give you a preview of another vertical here: https://northfeed.ca/
  And - did you actually see the time machine at the bottom of the right hand column? Or - was that just a wish list item of yours?
  [-]
  - jains99 23 hours ago
    Actually, both websites look the same. How? Are they based on the same template?
hackncheese 1 day ago
Combines the strength of AI at summarizing text and easy access to the actual information sources for verification, well done well done!
[-]
- dreadsword 1 day ago
  Cheers and thank you very much --- yes, LLMs are very well suited to editorial tasks!
wccrawford 1 day ago
Wow. That's amazing! I've bookmarked it because I think it's one of the best news sites I've seen now.
[-]
- dreadsword 1 day ago
  Well thank you so much for the very kind words, and don't hesitate to reach out with any feedback!
chicagojoe 1 day ago
This is great! Are you using a news API or pulling in RSS feeds yourself? Is there a list of what sources are included?
[-]
- dreadsword 3 hours ago
  Circling back to this: https://deadstack.net/sources
  Note that the top news breaker is The Verge, having broken about 10% of stories on my site; TechCrunch is next at 8%, followed by ... MacRumors at 7%.
- dreadsword 1 day ago
  Reading RSS myself, OLD SCHOOL: Cron Jobs. PHP. Hahaha! List of sources: At present, no; but if it's of interest, it would not be hard to add.
  I should also add - please post any recommendations re: sources to cover.
- dreadsword 1 day ago
  Hey - still thinking about sources here. With the data I have, I could actually do an interesting analysis of news sources - i.e.:
  - how often do their stories become members of clusters? - how "fast" are they to publish on a topic vs. other competitors - i.e.: who "breaks" the news? - what tags (people, companies, topics) does a given source stick close to? Which do they shy away from?
  Thanks very much for a really interesting set of ideas to explore!
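As a rough sketch of the "who breaks the news" question: given (cluster, source, timestamp) rows, count how often each source has the earliest story in a cluster. The tuple layout and function names are assumptions for the example, not the site's schema.

```python
from collections import Counter


def first_to_publish(stories) -> Counter:
    """Count, per source, the clusters in which that source published first.

    `stories` is an iterable of (cluster_id, source, published) tuples,
    where `published` is any comparable timestamp (for example a datetime).
    """
    earliest = {}  # cluster_id -> (published, source) of the earliest story seen
    for cluster_id, source, published in stories:
        best = earliest.get(cluster_id)
        if best is None or published < best[0]:
            earliest[cluster_id] = (published, source)
    return Counter(source for _, source in earliest.values())


def break_share(counts: Counter) -> dict:
    """Convert raw counts into each source's fraction of all clusters broken."""
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()} if total else {}
```

Ties on the timestamp go to whichever story is seen first, so a real analysis would probably want an explicit tie-breaking rule.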
wmeredith 1 day ago
I've built a couple different versions of this for myself over the years. I like yours! Thanks for sharing.
[-]
- dreadsword 1 day ago
  Cheers - thank you for the kind words!
mariusor 17 hours ago
If I'm not logged in you don't need cookies. Being blasted in the face with a cookie banner as the first thing on a web page is very disrespectful.
jiwidi 1 day ago
Pretty cool! How do you build these "stories" based on news?
[-]
- dreadsword 1 day ago
  Cheers and thank you! I'll reshare an earlier comment that I think answers your question - let me know:
  Thanks so much for the kind words - it's 100% o3-mini for clustering. I have zero editorial input as to what constitutes a cluster, what's "top" news, etc.
  The one subtlety is setting up the LLM to understand whether a new story belongs in an existing cluster, or, with > 1 neighbors, constitutes a new cluster. The challenge there is scoping the clustering window (hours of stories for consideration) and topic breadth to avoid creating Katamari-super-clusters that just end up with every story associated to them.
  At this point I seem to have found a sweet spot re: the hours window, the frequency of processing, and the design of the prompt such that it's working consistently.
  Very few false positives in terms of spurious clusters being created, or potential clusters being missed.
  [-]
  - jiwidi 1 day ago
    Very interesting, how do you do that? Do you limit what you feed it, or do it via custom instructions? I had a similar case, so I would love to know how you are doing the prompting here.
    In my case we went with embeddings and clustering to find papers close to each other, because LLMs were hallucinating.
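For comparison with the LLM approach, the embeddings route mentioned here can be sketched with a simple greedy, threshold-based grouping on cosine similarity. This assumes the embedding vectors already exist (how they are produced is out of scope), and the 0.8 threshold is an arbitrary illustrative value, not something reported in the thread.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def greedy_clusters(vectors, threshold: float = 0.8):
    """Group vector indices: each item joins the first cluster whose seed is similar enough.

    The first member of each cluster acts as its seed; items that match no seed
    start a new cluster. Returns a list of lists of indices into `vectors`.
    """
    clusters = []
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters
```

The result depends on input order and on the threshold, so this is only a baseline; it does have the advantage of being deterministic and unable to invent items that were never in the input.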
clueless 1 day ago
Looks very similar to https://particle.news - how would you distinguish your approach (other than the tech focus)?
[-]
- dreadsword 1 day ago
  Cool - particle looks great - I really like how visual it is.
  Distinguishing characteristics - personally I get value from the unambiguous timeline (no editorializing in /recent), and (as nice as the visual is) the non-visual, super simplistic presentation & the curated sources (...which I value b/c I curated them myself haha).
  So bottom line is that DS will appeal to a certain kind of obsessive compulsive news consumer and synthesizer that wants the right balance of signal to noise and a streamlined presentation that doesn't slow them down. I count myself among that group!
econ 1 day ago
Clicking "more" for a few extra words feels wrong.
[-]
- dreadsword 1 day ago
  Where are you seeing that?
Biologist123 1 day ago
Great idea! May I ask what the information source is?
[-]
- dreadsword 1 day ago
  Individual feeds from ~100 sites!
productiveminds 1 day ago
Simplistic site, looks clean and easy to navigate!
[-]
- dreadsword 1 day ago
  Many thanks for the kind words!
neilellis 1 day ago
Really good, clean and to the point, love it.
[-]
- dreadsword 1 day ago
  Cheers - thank-you so much!
botanrice 1 day ago
This is neat! Thanks for sharing.
[-]
- dreadsword 1 day ago
  Cheers and thank you!
p-s-v 1 day ago
Cool, how did you create it? What's the architecture like?
[-]
- dreadsword 1 day ago
  Thanks very much! Architecture - is truly recidivistic - LAMP, cron jobs, o3-mini, bootstrap. It works, it's fast because it's not complicated, and b/c I'm doing things like updating hourly vs. real time.
iJohnDoe 18 hours ago
Very impressive! I like it and continued to browse the content. Will be added to my daily list of sites to check out.
[-]
- dreadsword 3 hours ago
  Awesome - thanks so much for checking it out!
  Cheers!
tamimio 1 day ago
If you can get rid of the cookies message that would be great, as I will place the site as an app in my phone and that message is annoying to have when I open it.
[-]
- dreadsword 1 day ago
  You should only see that message once when you first show up, and as annoying as it is, there's a compliance element to it. Let me know if it's persisting for you after accepting!
jarmitage 1 day ago
RSS?
[-]
- dreadsword 1 day ago
  On the roadmap! RSS by tag (e.g. for https://deadstack.net/tag/quantum) and an RSS feed for /recent are both in progress.
metalliqaz 1 day ago
Really nice and clean, well done.
What is the purpose of having summaries for "Recent", "Incoming", and "Outgoing" all at the top? Seems like all content from the latter two is in the first, right?
[-]
- dreadsword 1 day ago
  You're right --- incoming & outgoing end up being redundant on the "Recent" view.
  Where they're (more) relevant is in the "Top" view where the LLM editor has picked a subset of stories to be categorized as top and incoming/outgoing are the ones that didn't make the cut, organized by timeliness.
  Definitely a gap in design!
- dreadsword 1 day ago
  Oh I should add: incoming will show stories ~20 minutes before they get picked up for "Top" inclusion, if they're going to make the cut, based on how jobs are scheduled.