[R] Search and chat interface for R-help and R-devel
Jeffrey Dick
j3||d|ck @end|ng |rom gm@||@com
Mon Jan 5 03:23:51 CET 2026
Dear R users,
I'm excited to announce R-help-chat, a search and chat interface for the R-help and R-devel mailing lists. The app is available at https://huggingface.co/spaces/jedick/R-help-chat.
The backend is a retrieval-augmented generation (RAG) system with LLM invocation at two steps: first the LLM analyzes the user's question to construct a search query, then it uses the retrieved emails to generate a response. The search tool is a hybrid of semantic search (vector embeddings) and lexical search (keyword search using a version of BM25).
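The post does not say how the semantic and lexical results are merged. One common approach is reciprocal rank fusion, sketched below as an illustration only; it is not taken from the app's source. The function name, the constant k=60, and the message identifiers are assumptions made for the example.

```python
def reciprocal_rank_fusion(rankings, k=60, top_n=6):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by both searches rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return ordered[:top_n]


# Hypothetical result lists from the two search methods.
semantic_hits = ["msg3", "msg1", "msg7"]
lexical_hits = ["msg1", "msg9", "msg3"]
fused = reciprocal_rank_fusion([semantic_hits, lexical_hits])
print(fused)
```

Rank-based fusion avoids having to put embedding similarities and BM25 scores on a common scale, which is one reason it is a frequent choice for hybrid search.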
The system supports multiple searches in a single turn (useful for comparing different topics or time periods) and chat memory for asking follow-up questions with previous conversational turns as context.
The retrieved emails are shown in the app below the chatbot. Preprocessing removes quoted lines (those beginning with ">") to reduce redundancy in the database, and the search tool is configured to return up to six emails per search. Citations for the LLM's response are also shown.
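The removal of quoted lines could look something like the following minimal sketch. It assumes that a quoted line is any line whose first character is ">"; the function name and sample message are hypothetical, and the app's actual preprocessing may differ.

```python
def strip_quoted_lines(body):
    """Drop lines that begin with '>' (quoted text from earlier emails)."""
    kept = [line for line in body.splitlines() if not line.startswith(">")]
    return "\n".join(kept)


# Hypothetical email body containing quoted replies.
email_body = "Hi all,\n> What does apply() do?\n>> Original question\nSee ?apply for details."
cleaned = strip_quoted_lines(email_body)
print(cleaned)
```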
The app itself is free software and the source code is available on GitHub (https://github.com/jedick/R-help-chat). However, the LLM it currently uses is a commercial product from OpenAI. I have enabled sharing of inputs and outputs with OpenAI to access complimentary daily tokens.
Regarding privacy and logging: I do not have access to server logs for the deployed app. I collect usage information with print statements that produce output like this:
2026-01-05T01:23:42 - Set R-help graph for session lrs8onnysyl
2026-01-05T01:24:04 - Get graph for session lrs8onnysyl
2026-01-05T01:56:38 - Delete graph for session lrs8onnysyl
2026-01-05T01:56:41 - Set R-devel graph for session lrs8onnysyl
2026-01-05T01:57:07 - Get graph for session lrs8onnysyl
2026-01-05T01:57:19 - Get graph for session lrs8onnysyl
2026-01-05T02:29:38 - Delete graph for session lrs8onnysyl
The session is a random identifier assigned to each page visit. The "Set graph" and "Get graph" entries mark the start of a chat and input to an existing chat, respectively. The "Delete graph" entry is logged when the user changes the mailing list, closes the page, or browses away from it.
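For readers who want to work with entries in this format, here is a small parsing sketch. The regular expression is inferred from the sample lines above and is an assumption, as are the function and field names; it is not part of the app.

```python
import re
from datetime import datetime

# Pattern inferred from the sample log lines; the list name appears only on "Set" entries.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\S+) - (?P<action>Set|Get|Delete)"
    r"(?: (?P<mailing_list>R-help|R-devel))? graph for session (?P<session>\w+)$"
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    entry["timestamp"] = datetime.fromisoformat(entry["timestamp"])
    return entry


entry = parse_log_line("2026-01-05T01:23:42 - Set R-help graph for session lrs8onnysyl")
print(entry)
```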
The email database covers the R-help and R-devel list archives from January 2015 to December 2025. I plan to make monthly updates to the database. I'm open to suggestions about adding other lists or longer time periods and welcome any other feedback.
I hope this will be a useful tool for list subscribers and the wider R community.
Regards,
Jeff Dick