RSSCopilot Requirements Specification

Table of Contents

Overview

Project Description

RSSCopilot is a Zotero plugin that enhances the built-in RSS functionality by providing intelligent relevance analysis of RSS feed items based on the user’s existing library. It uses semantic analysis of paper titles to help researchers quickly identify the most relevant new publications.

Target Users

  • Academic researchers

  • Graduate students

  • Research professionals

  • Anyone who uses Zotero’s RSS feature for academic literature tracking

Product Vision

Mission Statement

To help researchers efficiently discover relevant academic papers from RSS feeds by providing intelligent, personalized relevance analysis based on their existing research interests.

Core Values

  • Simplicity: Focus on core functionality without overwhelming users

  • Integration: Seamless integration with Zotero’s existing RSS features

  • Intelligence: Smart relevance analysis using modern language models

  • Efficiency: Quick and accurate identification of relevant papers

User Stories

Core User Stories

  1. As a researcher, I want to quickly identify papers relevant to my research so that I don’t miss important publications.

  2. As a user, I want to see relevance scores for RSS items so that I can prioritize my reading.

  3. As a reader, I want to sort RSS items by relevance so that I can focus on the most important papers first.

Configuration User Stories

  1. As a user, I want to choose my preferred language model so that I can optimize the relevance analysis.

  2. As a user, I want to set relevance thresholds so that I can customize what is considered highly relevant.

Functional Requirements

Core Features

1. Relevance Analysis

  • Calculate relevance scores (0-100%) for RSS items based on title similarity

  • Support multiple language models for vector encoding

  • Provide visual indicators for different relevance levels

    • 🔥 High relevance (90-100%)

    • 📈 Medium relevance (70-89%)

    • 📋 Low relevance (<70%)

2. User Interface

  • Add relevance column to RSS feed view

  • Support sorting by relevance score

  • Provide quick actions for item management

    • Save to library

    • Mark for later

    • Ignore

3. Configuration

  • Language model selection

  • Relevance threshold customization

  • Vector database management

Optional Features

  • Model comparison tools

  • Relevance trend analysis

  • Batch processing tools

Technical Requirements

System Architecture

Database Schema

-- papers: Store paper information
CREATE TABLE papers (
    id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    added_date INTEGER,
    last_updated INTEGER
);

-- paper_vectors: Store vectors for different models
CREATE TABLE paper_vectors (
    paper_id TEXT,
    model_id TEXT,
    vector BLOB NOT NULL,  -- dimension(384)
    created_date INTEGER,
    PRIMARY KEY (paper_id, model_id),
    FOREIGN KEY (paper_id) REFERENCES papers(id),
    FOREIGN KEY (model_id) REFERENCES models(id)
);

-- paper_vectors_index: Vector similarity search index
CREATE VIRTUAL TABLE paper_vectors_index USING vss0(
    vector(384)
);

-- rss_items: Store RSS items
CREATE TABLE rss_items (
    id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    created_date INTEGER
);

-- rss_relevance: Store relevance scores and vectors
CREATE TABLE rss_relevance (
    item_id TEXT,
    model_id TEXT,
    vector BLOB NOT NULL,  -- dimension(384)
    history_vectors_space_svd_u BLOB NOT NULL,  -- dimension(384)
    recent_vectors_space_svd_u BLOB NOT NULL,  -- dimension(384)
    history_relevance_score REAL,
    recent_relevance_score REAL,
    updated_date INTEGER,
    PRIMARY KEY (item_id, model_id),
    FOREIGN KEY (item_id) REFERENCES rss_items(id),
    FOREIGN KEY (model_id) REFERENCES models(id)
);

-- rss_relevance_index: Vector similarity search index
CREATE VIRTUAL TABLE rss_relevance_index USING vss0(
    vector(384),
    history_vectors_space_svd_u(384),
    recent_vectors_space_svd_u(384)
);

-- models: Store model configurations
CREATE TABLE models (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    api_config TEXT,
    is_active BOOLEAN DEFAULT true,
    created_date INTEGER
);

Core Processes

Initialization Process

  1. Plugin installation and configuration

  2. Language model selection

  3. Initial library scan

  4. Vector generation

  5. LSA space construction

RSS Processing

  1. Monitor Zotero RSS updates

  2. Process new items

  3. Calculate relevance scores

  4. Update UI

Multi-Model Support

  • Independent vector storage per model

  • Smooth model switching

  • Parallel model operation

  • Historical vector preservation

User Interface

RSS Feed View

[RSS Feed List]
Title                    Journal         Relevance    Actions
Deep Learning...        Nature          95% 🔥       [Save] [Later]
Neural Networks...      Science         75% 📈       [Save] [Later]
Quantum Computing...    Nature          45% 📋       [Save] [Later]

Configuration Interface

[RSSCopilot Settings]
├─ Model Service
│  ├─ Provider: [Dropdown]
│  └─ API Settings: [Input]
│
├─ Relevance Thresholds
│  ├─ High: [90%] 🔥
│  ├─ Medium: [70%] 📈
│  └─ Low: [50%] 📋
│
└─ Data Management
   ├─ Rebuild Index
   └─ Clear Cache

Data Architecture

Performance Requirements

Response Time

  • UI updates: < 100ms

  • Relevance calculation: < 500ms per item

  • Batch processing: < 5s for 100 items

Resource Usage

  • Memory: < 200MB additional RAM usage

  • Storage: < 1GB for vector database

  • CPU: < 10% during normal operation

Scalability

  • Support up to 100,000 papers in library

  • Handle up to 1,000 RSS items per feed

  • Process up to 10 concurrent RSS updates

Security Requirements

Data Protection

  • Secure storage of API keys

  • Local-only processing of paper data

  • Encrypted vector storage

Privacy

  • No data sharing with external services

  • Optional telemetry collection

  • Clear data retention policies

Implementation Timeline

Phase 1 (2 weeks)

  • Database implementation

  • Language model integration

  • Basic vector processing

Phase 2 (2 weeks)

  • RSS monitoring system

  • UI component development

  • Relevance calculation

Phase 3 (1 week)

  • Performance optimization

  • Configuration interface

  • Error handling

Success Metrics

  • User adoption rate

  • Relevance accuracy

  • System performance

  • User satisfaction

Appendix

Glossary

  • LSA: Latent Semantic Analysis

  • Vector Encoding: Process of converting text to numerical vectors

  • Relevance Score: Numerical measure of content similarity

References

  • Zotero API Documentation

  • Language Model Documentation

  • SQLite Documentation