Initial commit

BinaryBeastMaster 2025-05-10 15:09:23 -07:00
commit 25abcd2974
57 changed files with 15743 additions and 0 deletions

30
.gitattributes vendored Normal file

@@ -0,0 +1,30 @@
# Set default behavior, in case users don't have core.autocrlf set.
* text=auto
# Explicitly declare text files you want to always normalize and convert to LF in the repo
*.js text eol=lf
*.ts text eol=lf
*.html text eol=lf
*.css text eol=lf
*.json text eol=lf
*.md text eol=lf
*.xml text eol=lf
*.yaml text eol=lf
*.yml text eol=lf
*.svg text eol=lf
# Declare files that will always have CRLF line endings on checkout.
# *.bat text eol=crlf
# *.cmd text eol=crlf
# *.ps1 text eol=crlf
# Declare files that should not be touched (binary files)
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.pdf binary
*.zip binary
*.gz binary
*.tgz binary

24
.gitignore vendored Normal file

@@ -0,0 +1,24 @@
# Dependencies
node_modules/
npm-debug.log
yarn-debug.log
yarn-error.log
# Build outputs
dist/
build/
*.tsbuildinfo
# Environment variables
.env
.env.local
.env.development.local
.env.test.local
.env.production.local
# IDE and editor files
.idea/
.vscode/
*.swp
*.swo
.DS_Store

205
LICENSE Normal file

@@ -0,0 +1,205 @@
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
Copyright © 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
Preamble
The GNU Affero General Public License is a free, copyleft license for software and other kinds of works, specifically designed to ensure cooperation with the community in the case of network server software.
The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, our General Public Licenses are intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.
Developers that use our General Public Licenses protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License which gives you legal permission to copy, distribute and/or modify the software.
A secondary benefit of defending all users' freedom is that improvements made in alternate versions of the program, if they receive widespread use, become available for other developers to incorporate. Many developers of free software are heartened and encouraged by the resulting cooperation. However, in the case of software used on network servers, this result may fail to come about. The GNU General Public License permits making a modified version and letting the public access it on a server without ever releasing its source code to the public.
The GNU Affero General Public License is designed specifically to ensure that, in such cases, the modified source code becomes available to the community. It requires the operator of a network server to provide the source code of the modified version running there to the users of that server. Therefore, public use of a modified version, on a publicly accessible server, gives the public access to the source code of the modified version.
An older license, called the Affero General Public License and published by Affero, was designed to accomplish similar goals. This is a different license, not a version of the Affero GPL, but Affero has released a new version of the Affero GPL which permits relicensing under this license.
The precise terms and conditions for copying, distribution and modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU Affero General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based on the Program.
To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work.
A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work.
The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source.
The Corresponding Source for a work in source code form is that same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.
When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified it, and giving a relevant date.
b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices".
c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so.
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:
a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b.
d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d.
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.
"Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.
If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).
The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or authors of the material; or
e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors.
All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).
However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.
Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.
If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. "Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.
A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the Program, your modified version must prominently offer all users interacting with it remotely through a computer network (if your version supports such interaction) an opportunity to receive the Corresponding Source of your version by providing access to the Corresponding Source from a network server at no charge, through some standard or customary means of facilitating copying of software. This Corresponding Source shall include the Corresponding Source for any work covered by version 3 of the GNU General Public License that is incorporated pursuant to the following paragraph.
Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the work with which it is combined will remain governed by version 3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of the GNU Affero General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU Affero General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU Affero General Public License, you may choose any version ever published by the Free Software Foundation.
If the Program specifies that a proxy can decide which future versions of the GNU Affero General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program.
Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS

230
README.md Normal file
View file

@ -0,0 +1,230 @@
# Chat Relay: OpenAI-Compatible Relay for AI Chat Interfaces
Chat Relay is a system that allows Cline/RooCode to communicate with web-based AI chat interfaces (like Gemini, AI Studio, ChatGPT, and Claude) through an OpenAI-compatible API. This makes it possible to use models that may not have public APIs by driving their web interfaces. It also allows you to combine models that are fast or good at tool use (Claude) with slower, smarter models (Gemini 2.5 Pro).
---
## Architecture Overview
The system consists of three main components:
1. **OpenAI-Compatible API Server**: Implements an OpenAI-compatible API endpoint that Cline/RooCode can connect to.
2. **Browser Extension**: Connects to the API server via WebSocket and interacts with the chat interface.
3. **MCP Server**: Provides additional tools and resources for the system.
### ASCII Diagram
```
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌──────────────┐
│ │ │ │ │ │ │ │
│ Cline/ │◄── http ──►│ OpenAI- │◄─ websocket -►│ Browser │◄───────►│ Chat │
│ RooCode │ │ Compatible │ │ Extension │ │ Interface │
│ (App) │ │ Relay │ │ │ │ (Gemini, │
└─────────────┘ └─────────────┘ └─────────────┘ │ AI Studio, │
│ ChatGPT, │
│ Claude) │
└──────────────┘
```
### Mermaid Diagram
```mermaid
graph TD
A["Cline / RooCode"] -->|HTTP POST /v1/chat/completions| B["API Relay Server"]
B -->|WebSocket| C["Browser Extension"]
C -->|User Input| D["Chat Interface
(Gemini / AI Studio / ChatGPT / Claude)"]
D -->|AI Response| C
C -->|Captured Response| B
B -->|OpenAI-format JSON| A
```
---
## Data Flow
1. **Cline/RooCode to API Server**:
   - Sends HTTP POST to `/v1/chat/completions`
2. **API Server to Browser Extension**:
   - Sends message + `requestId` via WebSocket
3. **Extension to Chat Interface**:
   - Inserts text, clicks send
4. **Chat Interface to Extension**:
   - Captures response from UI
5. **Extension to API Server**:
   - Returns response tied to `requestId`
6. **API Server to Cline/RooCode**:
   - Formats response in OpenAI structure
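
The `requestId` round trip in steps 2 to 5 can be pictured with a small sketch. It keeps a map of in-flight requests, similar in spirit to the `pendingRequests` map in the server source. The helper names `createPendingRequest` and `settleRequest` are illustrative assumptions, not functions the project is known to expose.

```javascript
// Minimal sketch of requestId correlation (helper names are hypothetical).
const pendingRequests = new Map();
let requestCounter = 0;

// Register a request; the returned promise settles when the matching
// response arrives or when the timeout fires, whichever comes first.
function createPendingRequest(timeoutMs) {
  const requestId = requestCounter++;
  const promise = new Promise((resolve, reject) => {
    const timeoutId = setTimeout(() => {
      pendingRequests.delete(requestId);
      reject(new Error(`Request ${requestId} timed out after ${timeoutMs} ms`));
    }, timeoutMs);
    pendingRequests.set(requestId, { resolve, reject, timeoutId });
  });
  return { requestId, promise };
}

// Called when the extension returns a response tagged with requestId.
// Returns false for unknown or already settled requests.
function settleRequest(requestId, responseText) {
  const pending = pendingRequests.get(requestId);
  if (!pending) return false;
  clearTimeout(pending.timeoutId);
  pendingRequests.delete(requestId);
  pending.resolve(responseText);
  return true;
}
```

Keying everything on `requestId` lets a late or duplicate response be detected and ignored instead of being delivered to the wrong caller.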
---
## Components
### 1. OpenAI-Compatible API Server
The API server implements an OpenAI-compatible endpoint and manages browser extension connectivity.
**Key Features:**
- OpenAI-style `/v1/chat/completions` endpoint
- WebSocket server for real-time relay
- Ping/pong health checks
- Request timeouts, retry handling
- Tracks connection state
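
To make "OpenAI-style" concrete, here is a hedged sketch of wrapping captured chat text in the standard OpenAI chat completion response shape. The function name `toChatCompletion`, the `chatcmpl-` ID scheme, and the zeroed token counts are assumptions for illustration; the relay's actual formatting code may differ.

```javascript
// Wrap text captured from a chat UI in an OpenAI-style response object.
// Hypothetical helper: not taken from the project's source.
function toChatCompletion(requestId, model, content) {
  return {
    id: `chatcmpl-${requestId}`, // assumed ID scheme
    object: 'chat.completion',
    created: Math.floor(Date.now() / 1000), // Unix time in seconds
    model,
    choices: [
      {
        index: 0,
        message: { role: 'assistant', content },
        finish_reason: 'stop',
      },
    ],
    // Token counts are not available when scraping a web UI,
    // so zeros are used here as placeholders.
    usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
  };
}
```

A client that reads `choices[0].message.content` would then receive the captured text unchanged.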
### 2. Browser Extension
Interacts with the Gemini, AI Studio, ChatGPT, and Claude UIs: it injects chat messages into the page and captures the responses.
**Key Features:**
- Auto-send with retry/backoff
- DOM/debugger-based response capture
- Modular provider architecture:
  - [`AIStudioProvider`](extension/providers/aistudio.js)
  - [`ChatGptProvider`](extension/providers/chatgpt.js)
  - [`ClaudeProvider`](extension/providers/claude.js)
**Supported Chat Interfaces:**
- Gemini (`gemini.google.com`)
- AI Studio (`aistudio.google.com`)
- ChatGPT (`chatgpt.com`)
- Claude (`claude.ai`)
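
One way to picture how a modular provider setup could pick the right handler is a lookup keyed by hostname. This is a sketch under assumptions: the hostnames come from the list above, but the `PROVIDERS` map, the string labels, and `providerForUrl` are hypothetical and do not describe the extension's real selection logic.

```javascript
// Hypothetical hostname-to-provider lookup; labels are illustrative only.
const PROVIDERS = new Map([
  ['gemini.google.com', 'gemini'],
  ['aistudio.google.com', 'aistudio'],
  ['chatgpt.com', 'chatgpt'],
  ['claude.ai', 'claude'],
]);

// Return the provider label for a page URL, or null if unsupported.
function providerForUrl(url) {
  const { hostname } = new URL(url);
  return PROVIDERS.get(hostname) ?? null;
}
```

Matching on the exact hostname, rather than a substring of the URL, avoids treating unrelated sites as supported.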
ChatGPT is a trademark of OpenAI. Gemini and AI Studio are trademarks of Google. Claude is a trademark of Anthropic. This project is not affiliated with, endorsed by, or sponsored by OpenAI, Google, or Anthropic.
### 3. MCP Server
An optional developer utility server for simulating messages, testing extensions, or viewing traffic.
---
## Installation & Setup
### Prerequisites
- Node.js (v18+, as required by the Express 5 dependency)
- npm (v6+)
- Chrome browser
### API Server Setup
```bash
cd api-relay-server
npm install
npm start
```
### Browser Extension Setup
1. Open Chrome → `chrome://extensions/`
2. Enable Developer Mode
3. Click "Load unpacked"
4. Select `extension/` directory
### MCP Server (Optional)
```bash
cd mcp-server
npm run build
npm install -g .
```
Alternatively, build a package tarball and install it globally:
```bash
cd mcp-server
npm run build
npm pack
npm install -g ./chat-relay-mcp-0.0.1.tgz
```
<p></p>
<p>
<img src="images/using-mcp.jpg" alt="Description">
</p>
---
## Usage
### Configuring Cline/RooCode
1. Open settings → API Provider: OpenAI Compatible
2. Base URL: `http://localhost:3003/v1`
3. API Key: Any value (not validated)
4. Model ID: `gemini-pro`, `chatgpt`, `claude-3-sonnet`, or any label
<p></p>
<p>
<img src="images/using-roo-cline.jpg" alt="Description">
</p>
### Using the System
1. Start the API Server
2. Open Gemini, AI Studio, ChatGPT, or Claude in Chrome
3. Use Cline/RooCode to send messages
4. Responses are relayed through the extension
### Testing
```bash
curl -X POST http://localhost:3003/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "gemini-pro",
"messages": [{ "role": "user", "content": "Hello!" }],
"temperature": 0.7,
"max_tokens": 100
}'
```
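
The same request can be sent from Node.js. The sketch below assumes Node 18 or later (for the built-in `fetch`), the default port, and that the relay returns the standard OpenAI response shape. `buildChatRequest` and `sendChat` are illustrative names, not part of the project.

```javascript
// Build the same JSON body as the curl example above.
function buildChatRequest(content, model = 'gemini-pro') {
  return {
    model,
    messages: [{ role: 'user', content }],
    temperature: 0.7,
    max_tokens: 100,
  };
}

// Send one message through the relay and return the reply text.
// Assumes the relay is running and a supported chat tab is open.
async function sendChat(content, baseUrl = 'http://localhost:3003/v1') {
  const response = await fetch(`${baseUrl}/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildChatRequest(content)),
  });
  if (!response.ok) {
    throw new Error(`HTTP error: ${response.status}`);
  }
  const data = await response.json();
  // Assumed OpenAI-style shape: choices[0].message.content
  return data.choices[0].message.content;
}
```

Calling `sendChat('Hello!').then(console.log)` should print the reply once the chat interface finishes responding, which can take a while for long answers.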
---
## Configuration
### API Server
Edit [`api-relay-server/src/server.js`](api-relay-server/src/server.js):
- `PORT` (default: 3003)
- `REQUEST_TIMEOUT` (default: 180000ms)
- `PING_INTERVAL` (default: 30000ms)
- `CONNECTION_TIMEOUT` (default: 45000ms)
The relay server's admin panel is available at http://localhost:3003/admin for configuring these values and reviewing message flow.
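
The admin page validates settings before posting them to `/v1/admin/update-settings`. The sketch below restates those rules as a standalone function so they are easier to see at a glance. The field names (`requestTimeoutMs`, `port`, `newRequestBehavior`) follow the admin page source, while `validateSettings` itself is a hypothetical helper, and the server may apply additional checks of its own.

```javascript
// Validate admin settings the way the admin page appears to:
// positive timeout, port in 1..65535, behavior is 'queue' or 'drop'.
function validateSettings(input) {
  const settings = {};
  const errors = [];

  if (input.requestTimeoutMs !== undefined && input.requestTimeoutMs !== '') {
    const timeout = Number.parseInt(input.requestTimeoutMs, 10);
    if (Number.isNaN(timeout) || timeout <= 0) {
      errors.push('Invalid timeout: must be a positive number.');
    } else {
      settings.requestTimeoutMs = timeout;
    }
  }

  if (input.port !== undefined && input.port !== '') {
    const port = Number.parseInt(input.port, 10);
    if (Number.isNaN(port) || port < 1 || port > 65535) {
      errors.push('Invalid port: must be between 1 and 65535.');
    } else {
      settings.port = port;
    }
  }

  // Anything other than 'drop' falls back to the default, 'queue'.
  settings.newRequestBehavior = input.newRequestBehavior === 'drop' ? 'drop' : 'queue';

  return { settings, errors };
}
```

Note that, per the admin page, a port change only takes effect after a server restart.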
### Browser Extension
Edit [`extension/background.js`](extension/background.js):
- `serverHost`, `serverPort`, `serverProtocol`
- `reconnectInterval`
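
As a rough illustration of how these settings fit together, the sketch below builds the WebSocket URL from the three server fields and computes a reconnect delay. The values in `exampleConfig` (including the 5000 ms interval) are assumptions, and the capped exponential backoff is a swapped-in technique for illustration; the extension may simply retry at a fixed `reconnectInterval`.

```javascript
// Example values only; the real defaults live in extension/background.js.
const exampleConfig = {
  serverProtocol: 'ws',
  serverHost: 'localhost',
  serverPort: 3003,
  reconnectInterval: 5000, // assumed value, in milliseconds
};

// Combine protocol, host, and port into the WebSocket URL.
function buildServerUrl({ serverProtocol, serverHost, serverPort }) {
  return `${serverProtocol}://${serverHost}:${serverPort}`;
}

// Capped exponential backoff: base, 2x base, 4x base, ... up to maxMs.
function reconnectDelay(attempt, baseMs, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

With the example values, `buildServerUrl(exampleConfig)` yields `ws://localhost:3003`, which matches the address mentioned in the troubleshooting notes.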
---
## License
This project is licensed under the GNU Affero General Public License v3.0.
See the [LICENSE](LICENSE) file for details.
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL_v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
---
## Troubleshooting
**API Server**
- Check port availability
- If using Cline/Roo, check the Chat Relay provider config and ensure:
  - Base URL is `http://localhost:3003/v1` (adjust the port if changed from the default)
  - R1 model parameters are enabled
  - Streaming is not enabled
- Inspect logs for request mapping or timeout issues
**Browser Extension**
- Confirm connection to `ws://localhost:3003`
- Check `content.js` for updated selectors if UI changes
- Reload browser extension then refresh chat page
**WebSocket Issues**
- Try increasing `PING_INTERVAL` or `CONNECTION_TIMEOUT`
- Verify firewall or proxy settings
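
To reason about how `PING_INTERVAL` and `CONNECTION_TIMEOUT` interact, it can help to restate the server's per-connection health check as a pure function. This sketch mirrors the checks in the server source (terminate if not alive, mark inactive if a ping is unanswered, ping after a quiet period); the function name `heartbeatAction` and the returned labels are assumptions made for illustration.

```javascript
const CONNECTION_TIMEOUT = 45000; // default from the configuration list, in ms

// Decide what one health-check tick should do for a connection.
// conn: { isAlive, pendingPing, lastActivity } with lastActivity in ms.
function heartbeatAction(conn, now, connectionTimeout = CONNECTION_TIMEOUT) {
  if (!conn.isAlive) return 'terminate';          // already marked dead
  if (conn.pendingPing) return 'mark-inactive';   // last ping got no pong
  if (now - conn.lastActivity > connectionTimeout) return 'ping';
  return 'idle';                                  // recent activity, nothing to do
}
```

Under the defaults, and assuming the extension never answers, a connection that goes quiet is pinged on the first tick after 45 seconds of silence, marked inactive one tick later, and terminated on the tick after that. Raising either value therefore gives slow connections more slack at the cost of slower detection of dead ones.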

1465
api-relay-server/package-lock.json generated Normal file

File diff suppressed because it is too large Load diff

View file

@ -0,0 +1,32 @@
{
"name": "api-relay-server",
"version": "1.0.0",
"main": "dist/index.js",
"scripts": {
"build": "tsc",
"start": "node dist/index.js",
"dev": "nodemon --watch src --ext ts --exec \"npm run build && npm start\"",
"prestart": "npm run build",
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [],
"author": "",
"license": "ISC",
"description": "",
"dependencies": {
"body-parser": "^2.2.0",
"cors": "^2.8.5",
"express": "^5.1.0",
"ioredis": "^5.6.1",
"ws": "^8.18.2"
},
"devDependencies": {
"@types/body-parser": "^1.19.5",
"@types/cors": "^2.8.17",
"@types/express": "^5.0.1",
"@types/ioredis": "^4.28.10",
"@types/ws": "^8.18.1",
"nodemon": "^3.1.10",
"typescript": "^5.8.3"
}
}

View file

@ -0,0 +1,446 @@
<!--
Chat Relay: Relay for AI Chat Interfaces
Copyright (C) 2025 Jamison Moore
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see https://www.gnu.org/licenses/.
-->
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Chat Relay Admin</title>
<style>
body { font-family: Arial, sans-serif; margin: 0; padding: 0; background-color: #f4f4f4; color: #333; }
header { background-color: #333; color: #fff; padding: 1em; text-align: center; }
nav { background-color: #444; padding: 0.5em; }
nav ul { list-style-type: none; padding: 0; margin: 0; text-align: center; }
nav ul li { display: inline; margin-right: 20px; }
nav ul li a { color: #fff; text-decoration: none; font-weight: bold; }
nav ul li a.active { text-decoration: underline; }
.container { padding: 1em; }
.tab-content { display: none; }
.tab-content.active { display: block; }
h2 { border-bottom: 2px solid #333; padding-bottom: 0.5em; }
table { width: 100%; border-collapse: collapse; margin-top: 1em; }
th, td { border: 1px solid #ddd; padding: 8px; text-align: left; }
th { background-color: #555; color: white; }
.log-window {
background-color: #222;
color: #0f0;
font-family: 'Courier New', Courier, monospace;
padding: 10px;
height: 200px;
overflow-y: scroll;
border: 1px solid #444;
margin-top: 1em;
}
.log-entry { white-space: pre-wrap; }
.collapsible-header { background-color: #555; color: white; padding: 0.5em; cursor: pointer; text-align: center; }
</style>
</head>
<body>
<header>
<h1>Chat Relay Admin Dashboard</h1>
</header>
<nav>
<ul>
<li><a href="#" class="tab-link active" data-tab="messages">Messages</a></li>
<li><a href="#" class="tab-link" data-tab="settings">Settings</a></li>
<li><a href="#" class="tab-link" data-tab="status">Status</a></li>
</ul>
</nav>
<div class="container">
<div id="messages" class="tab-content active">
<h2>Message History <button id="refresh-messages-btn" style="font-size: 0.8em; margin-left: 10px;">Refresh Messages</button></h2>
<table>
<thead>
<tr>
<th>Start Timestamp</th>
<th>End Timestamp</th>
<th>Request ID</th>
<th>From Client (Cline)</th>
<th>To Extension</th>
<th>From Extension</th>
<th>To Client (Cline)</th>
<th>Status</th>
</tr>
</thead>
<tbody id="message-history-body">
<!-- Message rows will be inserted here by JavaScript -->
</tbody>
</table>
</div>
<div id="settings" class="tab-content">
<h2>Configuration Settings</h2>
<div id="settings-content">
<div>
<label for="port-input">Server Port: </label>
<input type="number" id="port-input" style="width: 80px;">
<small>(Requires server restart to apply)</small>
</div>
<div style="margin-top: 0.5em;">
<label for="request-timeout-input">Request Timeout (ms): </label>
<input type="number" id="request-timeout-input" style="width: 100px;">
</div>
<div style="margin-top: 0.5em;">
<label>New Request Behavior (if extension busy):</label>
<div>
<input type="radio" id="newRequestBehaviorQueue" name="newRequestBehavior" value="queue" checked>
<label for="newRequestBehaviorQueue">Queue</label>
</div>
<div>
<input type="radio" id="newRequestBehaviorDrop" name="newRequestBehavior" value="drop">
<label for="newRequestBehaviorDrop">Drop</label>
</div>
</div>
<button id="save-settings-btn" style="margin-top: 1em; margin-bottom: 0.5em;">Save Settings</button>
<span id="update-status-msg" style="margin-left: 10px; font-style: italic;"></span>
<p style="margin-top: 1em;">Ping Interval (ms): <span id="setting-ping-interval"></span></p>
</div>
</div>
<div id="status" class="tab-content">
<h2>Server Status</h2>
<div id="status-content">
<p>Server Uptime: <span id="status-uptime">N/A</span></p>
<p>Connected Extensions: <span id="status-connected-extensions">0</span></p>
<button id="restart-server-btn" style="margin-top: 1em; padding: 0.5em 1em; background-color: #d9534f; color: white; border: none; cursor: pointer;">Restart Server</button>
</div>
</div>
</div>
<div class="collapsible-header" onclick="toggleLogWindow()">Server Logs (click to toggle)</div>
<div class="log-window" id="log-window-content" style="display: none;">
<!-- Log entries will be inserted here -->
</div>
<script>
// Basic tab switching
const tabLinks = document.querySelectorAll('.tab-link');
const tabContents = document.querySelectorAll('.tab-content');
tabLinks.forEach(link => {
link.addEventListener('click', (e) => {
e.preventDefault();
tabLinks.forEach(l => l.classList.remove('active'));
tabContents.forEach(tc => tc.classList.remove('active'));
link.classList.add('active');
const activeTabContent = document.getElementById(link.dataset.tab);
activeTabContent.classList.add('active');
// If settings or status tab is activated, refresh their content
if (link.dataset.tab === 'settings' || link.dataset.tab === 'status') {
fetchAndDisplayServerInfo();
}
});
});
function toggleLogWindow() {
const logWindow = document.getElementById('log-window-content');
if (logWindow.style.display === 'none') {
logWindow.style.display = 'block';
} else {
logWindow.style.display = 'none';
}
}
const messageHistoryBody = document.getElementById('message-history-body');
const refreshButton = document.getElementById('refresh-messages-btn');
function createPreCell(data) {
const cell = document.createElement('td');
if (data === undefined || data === null) {
cell.textContent = 'N/A';
} else {
const pre = document.createElement('pre');
pre.style.margin = '0';
pre.style.whiteSpace = 'pre-wrap';
pre.style.maxHeight = '200px'; // Added max height
pre.style.overflowY = 'auto'; // Added scrollability
pre.textContent = JSON.stringify(data, null, 2);
cell.appendChild(pre);
}
return cell;
}
async function fetchAndDisplayMessageHistory() {
try {
const response = await fetch('/v1/admin/message-history');
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
const messages = await response.json();
messageHistoryBody.innerHTML = ''; // Clear existing rows
if (messages.length === 0) {
const row = messageHistoryBody.insertRow();
const cell = row.insertCell();
cell.colSpan = 8; // Adjusted for new column
cell.textContent = 'No message history found.';
cell.style.textAlign = 'center';
return;
}
// Group messages by requestId
const groupedMessages = messages.reduce((acc, logEntry) => {
const id = logEntry.requestId;
if (!acc[id]) {
acc[id] = {
requestId: id,
startTimestamp: logEntry.timestamp, // Default to first seen
endTimestamp: undefined, // Initialize endTimestamp
fromClient: undefined,
toExtension: undefined,
fromExtension: undefined,
toClient: undefined,
status: "Unknown"
};
}
// Update fields based on log type
switch (logEntry.type) {
case 'CHAT_REQUEST_RECEIVED':
// Ensure startTimestamp is the earliest one if multiple CHAT_REQUEST_RECEIVED logs existed (though unlikely for same ID)
if (!acc[id].startTimestamp || new Date(logEntry.timestamp) < new Date(acc[id].startTimestamp)) {
acc[id].startTimestamp = logEntry.timestamp;
}
acc[id].fromClient = logEntry.data.fromClient;
acc[id].toExtension = logEntry.data.toExtension;
if (acc[id].status === "Unknown" || acc[id].status === "Request In Progress") {
acc[id].status = "Request In Progress";
}
break;
case 'CHAT_RESPONSE_SENT':
acc[id].fromExtension = logEntry.data.fromExtension;
acc[id].toClient = logEntry.data.toClient;
acc[id].status = logEntry.data.status || "Success";
acc[id].endTimestamp = logEntry.timestamp; // Set end time on success
break;
case 'CHAT_ERROR_RESPONSE_SENT':
acc[id].toClient = logEntry.data.toClientError;
acc[id].status = logEntry.data.status || "Error";
acc[id].endTimestamp = logEntry.timestamp; // Set end time on error
break;
}
return acc;
}, {});
// Convert grouped messages object to an array and sort by timestamp (most recent first)
const consolidatedMessages = Object.values(groupedMessages).sort((a, b) => {
// Sort by startTimestamp, most recent first
return new Date(b.startTimestamp) - new Date(a.startTimestamp);
});
if (consolidatedMessages.length === 0) {
const row = messageHistoryBody.insertRow();
const cell = row.insertCell();
cell.colSpan = 8; // Adjusted for new column
cell.textContent = 'No consolidated message history to display.';
cell.style.textAlign = 'center';
return;
}
consolidatedMessages.forEach(msg => {
const row = messageHistoryBody.insertRow();
row.insertCell().textContent = new Date(msg.startTimestamp).toLocaleString();
row.insertCell().textContent = msg.endTimestamp ? new Date(msg.endTimestamp).toLocaleString() : (msg.status === "Request In Progress" ? "In Progress" : "N/A");
row.insertCell().textContent = msg.requestId;
row.appendChild(createPreCell(msg.fromClient));
row.appendChild(createPreCell(msg.toExtension));
row.appendChild(createPreCell(msg.fromExtension));
row.appendChild(createPreCell(msg.toClient));
row.insertCell().textContent = msg.status;
});
} catch (error) {
console.error('Error fetching message history:', error);
messageHistoryBody.innerHTML = '';
const row = messageHistoryBody.insertRow();
const cell = row.insertCell();
cell.colSpan = 8; // Span all eight table columns, matching the other placeholder rows
cell.textContent = `Error loading message history: ${error.message}`;
cell.style.color = 'red';
cell.style.textAlign = 'center';
}
}
if (refreshButton) {
refreshButton.addEventListener('click', fetchAndDisplayMessageHistory);
}
fetchAndDisplayMessageHistory(); // Initial load for messages
// Elements for settings and status
const portInputEl = document.getElementById('port-input'); // Corrected ID
const requestTimeoutInputEl = document.getElementById('request-timeout-input');
const newRequestBehaviorQueueEl = document.getElementById('newRequestBehaviorQueue');
const newRequestBehaviorDropEl = document.getElementById('newRequestBehaviorDrop');
const saveSettingsBtn = document.getElementById('save-settings-btn');
const updateStatusMsgEl = document.getElementById('update-status-msg');
const settingPingIntervalEl = document.getElementById('setting-ping-interval');
// References for status elements (declared once)
const statusUptimeEl = document.getElementById('status-uptime');
const statusConnectedExtensionsEl = document.getElementById('status-connected-extensions');
const restartServerBtn = document.getElementById('restart-server-btn');
function formatUptime(totalSeconds) {
if (totalSeconds === null || totalSeconds === undefined) return 'N/A';
const days = Math.floor(totalSeconds / (3600 * 24));
totalSeconds %= (3600 * 24);
const hours = Math.floor(totalSeconds / 3600);
totalSeconds %= 3600;
const minutes = Math.floor(totalSeconds / 60);
const seconds = totalSeconds % 60;
let uptimeString = '';
if (days > 0) uptimeString += `${days}d `;
if (hours > 0) uptimeString += `${hours}h `;
if (minutes > 0) uptimeString += `${minutes}m `;
uptimeString += `${seconds}s`;
return uptimeString.trim() || '0s';
}
async function fetchAndDisplayServerInfo() {
try {
const response = await fetch('/v1/admin/server-info');
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
const serverInfo = await response.json();
// Populate Settings
if(portInputEl) portInputEl.value = serverInfo.port || ''; // Use portInputEl
if(requestTimeoutInputEl) requestTimeoutInputEl.value = serverInfo.requestTimeoutMs !== null ? serverInfo.requestTimeoutMs : '';
if(settingPingIntervalEl) settingPingIntervalEl.textContent = serverInfo.pingIntervalMs !== null ? `${serverInfo.pingIntervalMs} ms` : 'N/A (Not Implemented)';
if (serverInfo.newRequestBehavior === 'drop') {
if(newRequestBehaviorDropEl) newRequestBehaviorDropEl.checked = true;
} else {
if(newRequestBehaviorQueueEl) newRequestBehaviorQueueEl.checked = true; // Default to queue
}
// Populate Status
if(statusUptimeEl) statusUptimeEl.textContent = formatUptime(serverInfo.uptimeSeconds);
if(statusConnectedExtensionsEl) statusConnectedExtensionsEl.textContent = serverInfo.connectedExtensionsCount !== null ? serverInfo.connectedExtensionsCount : 'N/A';
} catch (error) {
console.error('Error fetching server info:', error);
if(portInputEl) portInputEl.value = 'Error';
if(requestTimeoutInputEl) requestTimeoutInputEl.value = 'Error';
if(settingPingIntervalEl) settingPingIntervalEl.textContent = 'Error';
if(newRequestBehaviorQueueEl) newRequestBehaviorQueueEl.checked = true; // Default on error
// Ensure error handling for status elements
if(statusUptimeEl) statusUptimeEl.textContent = 'Error loading uptime';
if(statusConnectedExtensionsEl) statusConnectedExtensionsEl.textContent = 'Error loading connections';
}
}
async function handleSaveSettings() {
if (!requestTimeoutInputEl || !portInputEl || !updateStatusMsgEl || !newRequestBehaviorQueueEl || !newRequestBehaviorDropEl) return;
const newTimeout = parseInt(requestTimeoutInputEl.value, 10);
const newPort = parseInt(portInputEl.value, 10);
const selectedNewRequestBehavior = newRequestBehaviorQueueEl.checked ? 'queue' : 'drop';
let settingsToUpdate = {};
let validationError = false;
let messages = [];
if (requestTimeoutInputEl.value.trim() !== '') { // Only process if there's input
if (!isNaN(newTimeout) && newTimeout > 0) {
settingsToUpdate.requestTimeoutMs = newTimeout;
} else {
messages.push('Invalid timeout: Must be a positive number.');
validationError = true;
}
}
if (portInputEl.value.trim() !== '') { // Only process if there's input
if (!isNaN(newPort) && newPort > 0 && newPort <= 65535) {
settingsToUpdate.port = newPort;
} else {
messages.push('Invalid port: Must be between 1 and 65535.');
validationError = true;
}
}
// Always include newRequestBehavior as it's controlled by radio buttons
// No specific validation needed here as it's either 'queue' or 'drop'
settingsToUpdate.newRequestBehavior = selectedNewRequestBehavior;
if (validationError) {
updateStatusMsgEl.textContent = messages.join(' ');
updateStatusMsgEl.style.color = 'red';
setTimeout(() => { updateStatusMsgEl.textContent = ''; }, 7000);
return;
}
if (Object.keys(settingsToUpdate).length === 0) {
updateStatusMsgEl.textContent = 'No changes to save.';
updateStatusMsgEl.style.color = 'blue';
setTimeout(() => { updateStatusMsgEl.textContent = ''; }, 5000);
return;
}
updateStatusMsgEl.textContent = 'Saving settings...';
updateStatusMsgEl.style.color = 'orange';
try {
const response = await fetch('/v1/admin/update-settings', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(settingsToUpdate)
});
const result = await response.json();
if (response.ok) {
updateStatusMsgEl.textContent = result.message || 'Settings updated successfully!';
updateStatusMsgEl.style.color = 'green';
// The backend message already indicates if restart is needed for port
fetchAndDisplayServerInfo();
} else {
updateStatusMsgEl.textContent = `Error: ${result.error || 'Failed to update settings.'}`;
updateStatusMsgEl.style.color = 'red';
}
} catch (error) {
console.error('Error updating settings:', error);
updateStatusMsgEl.textContent = 'Failed to send update command.';
updateStatusMsgEl.style.color = 'red';
}
// Keep message longer if it mentions restart
const clearTime = updateStatusMsgEl.textContent.toLowerCase().includes('restart') ? 15000 : 7000;
setTimeout(() => { updateStatusMsgEl.textContent = ''; }, clearTime);
}
if (saveSettingsBtn) {
saveSettingsBtn.addEventListener('click', handleSaveSettings);
}
async function handleRestartServer() {
if (confirm('Are you sure you want to restart the server?')) {
try {
const response = await fetch('/v1/admin/restart-server', { method: 'POST' });
const result = await response.json();
alert(result.message || 'Restart command sent.');
} catch (error) {
console.error('Error restarting server:', error);
alert('Failed to send restart command to server.');
}
}
}
if (restartServerBtn) {
restartServerBtn.addEventListener('click', handleRestartServer);
}
// Initial load for settings and status
fetchAndDisplayServerInfo();
console.log("Admin UI initialized. Message history, settings, status, and restart functionality implemented.");
</script>
</body>
</html>

View file

@ -0,0 +1,19 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// Import the server
require('./server');

View file

@ -0,0 +1,18 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
import './server';

View file

@ -0,0 +1,473 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
const express = require('express');
const bodyParser = require('body-parser');
const cors = require('cors');
const { WebSocketServer } = require('ws');
const http = require('http');
// Create Express app
const app = express();
app.use(cors());
app.use(bodyParser.json({ limit: '50mb' }));
// Health check endpoint
app.get('/health', (req, res) => {
const aliveConnections = activeConnections.filter(conn => conn.isAlive);
res.status(200).json({
status: 'ok',
timestamp: new Date().toISOString(),
activeBrowserConnections: aliveConnections.length,
totalTrackedBrowserConnections: activeConnections.length,
webSocketServerState: wss.options.server.listening ? 'listening' : 'not_listening' // wss.readyState is not standard for server
});
});
// Create HTTP server
const server = http.createServer(app);
// Create WebSocket server for browser extension communication
const wss = new WebSocketServer({ server });
// Global variables
let activeConnections = [];
const pendingRequests = new Map();
let requestCounter = 0;
// Connection health check interval (in milliseconds)
const PING_INTERVAL = 30000; // 30 seconds
const CONNECTION_TIMEOUT = 45000; // 45 seconds
// Handle WebSocket connections from browser extensions
wss.on('connection', (ws, req) => { // Added req to log client IP
const clientIp = req.socket.remoteAddress;
console.log(`SERVER: Browser extension connected from IP: ${clientIp}`);
// Initialize connection state
ws.isAlive = true;
ws.pendingPing = false;
ws.lastActivity = Date.now();
// Add to active connections
activeConnections.push(ws);
// Set up ping interval for this connection
const pingInterval = setInterval(() => {
// Check if connection is still alive
if (!ws.isAlive) {
console.log('Browser extension connection timed out, terminating');
clearInterval(pingInterval);
ws.terminate();
return;
}
// If we're still waiting for a pong from the last ping, mark as not alive
if (ws.pendingPing) {
console.log('Browser extension not responding to ping, marking as inactive');
ws.isAlive = false;
return;
}
// Check if there's been activity recently
const inactiveTime = Date.now() - ws.lastActivity;
if (inactiveTime > CONNECTION_TIMEOUT) {
console.log(`Browser extension inactive for ${inactiveTime}ms, sending ping`);
// Send a ping to check if still alive
ws.pendingPing = true;
try {
ws.ping();
} catch (error) {
console.error('Error sending ping:', error);
ws.isAlive = false;
}
}
}, PING_INTERVAL);
// Handle pong messages (response to ping)
ws.on('pong', () => {
ws.isAlive = true;
ws.pendingPing = false;
ws.lastActivity = Date.now();
console.log('Browser extension responded to ping');
});
// Handle messages from browser extension
ws.on('message', (messageBuffer) => {
const rawMessage = messageBuffer.toString();
console.log(`SERVER: Received raw message from extension (IP: ${clientIp}): ${rawMessage.substring(0, 500)}${rawMessage.length > 500 ? '...' : ''}`);
try {
// Update last activity timestamp
ws.lastActivity = Date.now();
const data = JSON.parse(rawMessage);
console.log(`SERVER: Parsed message data from extension (IP: ${clientIp}):`, data);
const { requestId, type } = data;
if (requestId === undefined) {
console.warn(`SERVER: Received message without requestId from IP ${clientIp}:`, data);
// Handle other non-request-specific messages if any (e.g., status pings initiated by extension)
if (type === 'EXTENSION_STATUS') {
console.log(`SERVER: Browser extension status from IP ${clientIp}: ${data.status}`);
}
return;
}
// Log based on new message types from background.js
if (type === 'CHAT_RESPONSE_CHUNK') {
const chunkContent = data.chunk ? data.chunk.substring(0, 200) + (data.chunk.length > 200 ? '...' : '') : '[empty chunk]';
console.log(`SERVER: Received CHAT_RESPONSE_CHUNK for requestId: ${requestId} from IP ${clientIp}. Chunk (first 200): ${chunkContent}. IsFinal: ${data.isFinal}`);
const pendingRequest = pendingRequests.get(requestId);
if (pendingRequest) {
console.log(`SERVER: Processing CHAT_RESPONSE_CHUNK for pending request ${requestId} from IP ${clientIp}. IsFinal: ${data.isFinal}, Chunk (first 200): ${chunkContent}`);
// Initialize accumulatedChunks if it doesn't exist (should be set on creation)
if (typeof pendingRequest.accumulatedChunks === 'undefined') {
pendingRequest.accumulatedChunks = '';
}
if (data.chunk) { // Ensure chunk is not null or undefined
pendingRequest.accumulatedChunks += data.chunk;
}
if (data.isFinal) {
console.log(`SERVER: Request ${requestId} (IP: ${clientIp}) received final CHAT_RESPONSE_CHUNK. Attempting to resolve promise.`);
if (pendingRequest.timeoutId) {
clearTimeout(pendingRequest.timeoutId);
console.log(`SERVER: Request ${requestId} (IP: ${clientIp}) timeout cleared.`);
}
pendingRequest.resolve(pendingRequest.accumulatedChunks);
pendingRequests.delete(requestId);
console.log(`SERVER: Request ${requestId} (IP: ${clientIp}) promise resolved and removed from pending. Total length: ${pendingRequest.accumulatedChunks.length}`);
} else {
console.log(`SERVER: Accumulated chunk for requestId ${requestId} (IP: ${clientIp}). Current total length: ${pendingRequest.accumulatedChunks.length}`);
}
} else {
console.log(`SERVER: Received CHAT_RESPONSE_CHUNK for request ${requestId} (IP: ${clientIp}, isFinal: ${data.isFinal}), but no pending request found.`);
}
} else if (type === 'CHAT_RESPONSE_STREAM_ENDED') {
const pendingRequestStream = pendingRequests.get(requestId);
if (pendingRequestStream) {
console.log(`SERVER: Processing CHAT_RESPONSE_STREAM_ENDED for pending request ${requestId} (IP: ${clientIp}).`);
// This message type only signals the end of the stream; the data itself arrives in CHAT_RESPONSE_CHUNK.
// Settled requests are removed from pendingRequests, so finding one here means the stream
// ended before a final chunk resolved it.
console.warn(`SERVER: Stream ended for requestId ${requestId} (IP: ${clientIp}), but request was not fully resolved with data. This might be an issue.`);
} else {
console.log(`SERVER: Received CHAT_RESPONSE_STREAM_ENDED for request ${requestId} (IP: ${clientIp}), but no pending request found.`);
}
} else if (type === 'CHAT_RESPONSE_ERROR') {
const errorMsg = data.error || "Unknown error from extension.";
console.error(`SERVER: Received CHAT_RESPONSE_ERROR for requestId: ${requestId} (IP: ${clientIp}). Error: ${errorMsg}`);
const pendingRequestError = pendingRequests.get(requestId);
if (pendingRequestError) {
console.log(`SERVER: Processing CHAT_RESPONSE_ERROR for pending request ${requestId} (IP: ${clientIp}).`);
if (pendingRequestError.timeoutId) {
clearTimeout(pendingRequestError.timeoutId);
console.log(`SERVER: Request ${requestId} (IP: ${clientIp}) timeout cleared due to error.`);
}
pendingRequestError.reject(new Error(`Extension reported error for request ${requestId}: ${errorMsg}`));
pendingRequests.delete(requestId);
console.log(`SERVER: Request ${requestId} (IP: ${clientIp}) rejected due to CHAT_RESPONSE_ERROR and removed from pending.`);
} else {
console.log(`SERVER: Received CHAT_RESPONSE_ERROR for request ${requestId} (IP: ${clientIp}), but no pending request found.`);
}
} else if (type === 'CHAT_RESPONSE') { // Keep old CHAT_RESPONSE for compatibility if content script DOM fallback sends it
const { response } = data;
console.log(`SERVER: Received (legacy) CHAT_RESPONSE for requestId: ${requestId} from IP ${clientIp}. Response (first 100): ${response ? response.substring(0,100) : '[empty]'}`);
const pendingRequest = pendingRequests.get(requestId);
if (pendingRequest) {
if (pendingRequest.timeoutId) clearTimeout(pendingRequest.timeoutId);
pendingRequest.resolve(response);
pendingRequests.delete(requestId);
console.log(`SERVER: Request ${requestId} resolved with (legacy) CHAT_RESPONSE from IP ${clientIp}.`);
} else {
console.log(`SERVER: Received (legacy) CHAT_RESPONSE for request ${requestId} from IP ${clientIp}, but no pending request found.`);
}
} else if (type === 'EXTENSION_ERROR') { // General extension error not tied to a request
console.error(`SERVER: Browser extension (IP: ${clientIp}) reported general error: ${data.error}`);
} else if (type === 'EXTENSION_STATUS') {
console.log(`SERVER: Browser extension (IP: ${clientIp}) status: ${data.status}`);
} else {
console.warn(`SERVER: Received unknown message type '${type}' from IP ${clientIp} for requestId ${requestId}:`, data);
}
} catch (error) {
console.error(`SERVER: Error processing WebSocket message from IP ${clientIp}:`, error, `Raw message: ${rawMessage}`);
}
});
// Handle disconnection
ws.on('close', (code, reason) => {
const reasonString = reason ? reason.toString() : 'No reason given';
console.log(`SERVER: Browser extension (IP: ${clientIp}) disconnected. Code: ${code}, Reason: ${reasonString}`);
clearInterval(pingInterval);
activeConnections = activeConnections.filter(conn => conn !== ws);
// Check if there are any pending requests that were using this connection
// and reject them with a connection closed error
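pendingRequests.forEach((request, requestId) => {
if (request.connection === ws) {
console.log(`SERVER: Rejecting request ${requestId} (IP: ${clientIp}) due to connection close`);
// Clear the timeout so the timer does not linger after the request is settled
if (request.timeoutId) {
clearTimeout(request.timeoutId);
}
request.reject(new Error('Browser extension disconnected'));
pendingRequests.delete(requestId);
}
});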
});
// Handle errors
ws.on('error', (error) => {
console.error(`SERVER: WebSocket error for connection from IP ${clientIp}:`, error);
ws.isAlive = false; // Mark as not alive on error
// Consider terminating and cleaning up like in 'close' if error is fatal
});
});
// Create API router
const apiRouter = express.Router();
// Configuration
const REQUEST_TIMEOUT = 300000; // 5 minutes (in milliseconds)
const MAX_RETRIES = 2; // Maximum number of retries for a failed request
// Helper function to find the best active connection
function getBestConnection() {
// Filter out connections that are not alive
const aliveConnections = activeConnections.filter(conn => conn.isAlive);
if (aliveConnections.length === 0) {
return null;
}
// Sort connections by last activity (most recent first)
aliveConnections.sort((a, b) => b.lastActivity - a.lastActivity);
return aliveConnections[0];
}
// OpenAI-compatible chat completions endpoint
apiRouter.post('/chat/completions', async (req, res) => {
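// Declared outside the try block so the catch block below can read it;
// a const inside the try would be out of scope there and always log as undefined.
let requestId;
try {
const { messages, model, temperature, max_tokens } = req.body;
console.log(`SERVER: Full incoming HTTP request body for request ID (to be generated):`, JSON.stringify(req.body, null, 2));
// Generate a unique request ID
requestId = requestCounter++;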
// Extract the user's message (last message in the array)
const userMessage = messages[messages.length - 1].content;
// Get the best active connection
const extension = getBestConnection();
// Check if we have any active connections
if (!extension) {
return res.status(503).json({
error: {
message: "No active browser extension connected. Please open the chat interface and ensure the extension is active.",
type: "server_error",
code: "no_extension_connected"
}
});
}
// Create a promise that will be resolved when the response is received
console.log(`SERVER: Request ${requestId} creating response promise.`);
const responsePromise = new Promise((resolve, reject) => {
const internalResolve = (value) => {
console.log(`SERVER: Request ${requestId} internal promise resolve function called.`);
resolve(value);
};
const internalReject = (reason) => {
console.log(`SERVER: Request ${requestId} internal promise reject function called.`);
reject(reason);
};
// Set a timeout to reject the promise after the configured timeout
const timeoutId = setTimeout(() => {
if (pendingRequests.has(requestId)) {
console.error(`SERVER: Request ${requestId} timed out after ${REQUEST_TIMEOUT}ms. Rejecting promise.`);
pendingRequests.delete(requestId); // Ensure cleanup
internalReject(new Error('Request timed out'));
} else {
console.warn(`SERVER: Request ${requestId} timeout triggered, but request no longer in pendingRequests. It might have resolved or errored just before timeout.`);
}
}, REQUEST_TIMEOUT);
// Store the promise resolvers and the connection being used
pendingRequests.set(requestId, {
resolve: internalResolve,
reject: internalReject,
connection: extension,
timeoutId,
retryCount: 0,
accumulatedChunks: '' // Initialize for chunk accumulation
});
console.log(`SERVER: Request ${requestId} added to pendingRequests. Timeout ID: ${timeoutId}`);
});
// Prepare the message
const message = {
type: 'SEND_CHAT_MESSAGE',
requestId,
message: userMessage,
settings: {
model,
temperature,
max_tokens
}
};
// Send the message to the browser extension
try {
console.log(`SERVER: Request ${requestId} - Sending full message to extension:`, JSON.stringify(message, null, 2));
extension.send(JSON.stringify(message));
console.log(`SERVER: Request ${requestId} (message type: ${message.type}) sent to browser extension (IP: ${extension.remoteAddress || 'unknown'}). Waiting for response...`);
// Update last activity timestamp
extension.lastActivity = Date.now();
} catch (error) {
console.error(`Error sending message to extension for request ${requestId}:`, error);
// Clean up the pending request
if (pendingRequests.has(requestId)) {
const pendingRequest = pendingRequests.get(requestId);
if (pendingRequest.timeoutId) {
clearTimeout(pendingRequest.timeoutId);
}
pendingRequests.delete(requestId);
}
return res.status(500).json({
error: {
message: "Failed to send message to browser extension",
type: "server_error",
code: "extension_communication_error"
}
});
}
// Wait for the response
const awaitStartTime = Date.now();
console.log(`SERVER: Request ${requestId} is now awaiting responsePromise (extension response). Timeout set to ${REQUEST_TIMEOUT}ms.`);
const response = await responsePromise;
const awaitEndTime = Date.now();
console.log(`SERVER: Request ${requestId} await responsePromise completed in ${awaitEndTime - awaitStartTime}ms. Received response from extension. Preparing to send to client.`);
// Format the response in OpenAI format
const formatStartTime = Date.now();
const formattedResponse = {
id: `chatcmpl-${Date.now()}`,
object: "chat.completion",
created: Math.floor(Date.now() / 1000),
model: model || "relay-model", // model is from req.body
choices: [
{
index: 0,
message: {
role: "assistant",
content: response // response is the string from the extension
},
finish_reason: "stop"
}
],
usage: {
prompt_tokens: -1, // We don't track tokens
completion_tokens: -1,
total_tokens: -1
}
// service_tier, logprobs, refusal, annotations, and detailed usage are intentionally omitted to keep the response minimal
};
console.log(`SERVER: Request ${requestId} - Full outgoing HTTP response body:`, JSON.stringify(formattedResponse, null, 2));
res.json(formattedResponse);
const sendEndTime = Date.now();
console.log(`SERVER: Request ${requestId} formatted and sent response to client in ${sendEndTime - formatStartTime}ms (total after await: ${sendEndTime - awaitEndTime}ms).`);
} catch (error) {
const reqIdForLog = typeof requestId !== 'undefined' ? requestId : (error && typeof error.requestId !== 'undefined' ? error.requestId : 'UNKNOWN');
console.error(`SERVER: Error processing chat completion for request ${reqIdForLog}:`, error);
if (typeof requestId === 'undefined') {
console.error(`SERVER: CRITICAL - 'requestId' was undefined in catch block. Error object requestId: ${error && error.requestId}`);
}
// Determine the appropriate status code based on the error
let statusCode = 500;
let errorType = "server_error";
let errorCode = "internal_error";
if (error.message === 'Request timed out') {
statusCode = 504; // Gateway Timeout
errorType = "timeout_error";
errorCode = "request_timeout";
} else if (error.message === 'Browser extension disconnected') {
statusCode = 503; // Service Unavailable
errorType = "server_error";
errorCode = "extension_disconnected";
}
const errorResponsePayload = {
error: {
message: error.message,
type: errorType,
code: errorCode
}
};
console.log(`SERVER: Request ${reqIdForLog} - Sending error response to client:`, JSON.stringify(errorResponsePayload, null, 2));
res.status(statusCode).json(errorResponsePayload);
}
});
// Models endpoint
apiRouter.get('/models', (req, res) => {
res.json({
object: "list",
data: [
{
id: "gemini-pro",
object: "model",
created: 1677610602,
owned_by: "relay"
},
{
id: "chatgpt",
object: "model",
created: 1677610602,
owned_by: "relay"
},
{
id: "claude-3",
object: "model",
created: 1677610602,
owned_by: "relay"
}
]
});
});
// Mount the API router
app.use('/v1', apiRouter);
// Start the server
const PORT = process.env.PORT || 3003;
server.listen(PORT, () => {
console.log(`OpenAI-compatible relay server running on port ${PORT}`);
console.log(`WebSocket server for browser extensions running on ws://localhost:${PORT}`);
});
module.exports = server;
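
The chunk handling and timeout logic in the file above can be condensed into a small standalone sketch. This is only an illustration under stated assumptions: the names `pending`, `registerRequest`, and `handleChunk` are hypothetical and do not appear in the relay server, and the WebSocket transport is left out entirely.

```javascript
// Hypothetical, simplified model of the pending-request bookkeeping.
// Not the server's actual code: names and structure are assumptions.
const pending = new Map();

// Register a request and return a promise that settles when the final
// chunk arrives or the timeout fires, whichever happens first.
function registerRequest(requestId, timeoutMs) {
  return new Promise((resolve, reject) => {
    const timeoutId = setTimeout(() => {
      // Map.delete returns true only if the entry was still pending
      if (pending.delete(requestId)) {
        reject(new Error('Request timed out'));
      }
    }, timeoutMs);
    pending.set(requestId, { resolve, reject, timeoutId, accumulatedChunks: '' });
  });
}

// Append a chunk; on the final chunk, clear the timer and resolve with
// the full accumulated text. Returns false for unknown request IDs.
function handleChunk(requestId, chunk, isFinal) {
  const entry = pending.get(requestId);
  if (!entry) return false;
  if (chunk) entry.accumulatedChunks += chunk;
  if (isFinal) {
    clearTimeout(entry.timeoutId);
    pending.delete(requestId);
    entry.resolve(entry.accumulatedChunks);
  }
  return true;
}

// Usage: two chunks for request 1, the second marked final
const reply = registerRequest(1, 1000);
handleChunk(1, 'Hel', false);
handleChunk(1, 'lo', true);
reply.then((text) => console.log(text));
```

Clearing the timer at the moment the request settles keeps the map and the timer queue in step, so a late timeout callback never finds a stale entry.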


@ -0,0 +1,676 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
import express, { Request, Response, NextFunction, Router } from 'express';
import bodyParser from 'body-parser';
import cors from 'cors';
import { WebSocketServer, WebSocket } from 'ws';
import http from 'http';
import path from 'path';
import fs from 'fs';
// Interfaces
interface PendingRequest {
resolve: (value: any) => void;
reject: (reason?: any) => void;
}
interface WebSocketMessage {
type: string;
requestId?: number;
message?: string; // For messages sent from server to extension
response?: string; // For CHAT_RESPONSE from extension (older DOM method)
chunk?: string; // For CHAT_RESPONSE_CHUNK from extension (debugger method)
isFinal?: boolean; // Flag for CHAT_RESPONSE_CHUNK
error?: string; // For CHAT_RESPONSE_ERROR from extension
settings?: {
model?: string;
temperature?: number;
max_tokens?: number;
};
}
// Queuing/Dropping System State
let activeExtensionProcessingId: number | null = null;
// newRequestBehavior will be initialized after loadServerConfig()
let newRequestBehavior: 'queue' | 'drop';
interface QueuedRequest {
requestId: number;
req: Request; // Express Request object
res: Response; // Express Response object
userMessage: string;
model?: string;
temperature?: number;
max_tokens?: number;
}
let requestQueue: QueuedRequest[] = [];
// Global variables
let activeConnections: WebSocket[] = [];
const pendingRequests = new Map<number, PendingRequest>();
let requestCounter = 0;
// In-memory store for admin messages
interface ModelSettings {
model?: string;
temperature?: number;
max_tokens?: number;
}
interface ChatRequestData {
fromClient: string;
toExtension: WebSocketMessage; // Message forwarded to the extension (see WebSocketMessage above)
modelSettings: ModelSettings;
}
interface ChatResponseData {
fromExtension: string;
toClient: any; // This could be more specific if the OpenAI response structure is defined
status: string;
}
interface ChatErrorData {
toClientError: any; // This could be more specific if the error JSON structure is defined
status: string;
}
type AdminLogDataType = ChatRequestData | ChatResponseData | ChatErrorData | any; // Fallback to any for other types
interface AdminLogEntry {
timestamp: string;
type:
| 'CHAT_REQUEST_RECEIVED'
| 'CHAT_RESPONSE_SENT'
| 'CHAT_ERROR_RESPONSE_SENT'
| 'CHAT_REQUEST_QUEUED'
| 'CHAT_REQUEST_DROPPED'
| 'CHAT_REQUEST_DEQUEUED'
| 'CHAT_REQUEST_PROCESSING'
| 'CHAT_REQUEST_ERROR' // For pre-processing errors like no extension
| 'SETTING_UPDATE' // Existing type, ensure it's included
| string; // Fallback for other/future types
requestId: string;
data: AdminLogDataType;
}
const MAX_ADMIN_HISTORY_LENGTH = 1000;
let adminMessageHistory: AdminLogEntry[] = [];
const serverStartTime = Date.now(); // Store server start time
// Configuration file path
const CONFIG_FILE_PATH = path.join(__dirname, 'server-config.json');
const RESTART_TRIGGER_FILE_PATH = path.join(__dirname, '.triggerrestart'); // For explicitly triggering nodemon
interface ServerConfig {
port?: number;
requestTimeoutMs?: number;
lastRestartRequestTimestamp?: number; // Epoch milliseconds of the last restart requested via the admin API
newRequestBehavior?: 'queue' | 'drop';
}
// Function to read configuration
function loadServerConfig(): ServerConfig {
try {
if (fs.existsSync(CONFIG_FILE_PATH)) {
const configFile = fs.readFileSync(CONFIG_FILE_PATH, 'utf-8');
return JSON.parse(configFile) as ServerConfig;
}
} catch (error) {
console.error('Error reading server-config.json, using defaults/env vars:', error);
}
return {};
}
// Function to write configuration
function saveServerConfig(config: ServerConfig): void {
try {
fs.writeFileSync(CONFIG_FILE_PATH, JSON.stringify(config, null, 2), 'utf-8');
console.log('Server configuration saved to server-config.json');
} catch (error) {
console.error('Error writing server-config.json:', error);
}
}
// Load initial config
const initialConfig = loadServerConfig();
// Initialize newRequestBehavior from config, defaulting to 'queue'
newRequestBehavior = initialConfig.newRequestBehavior && (initialConfig.newRequestBehavior === 'queue' || initialConfig.newRequestBehavior === 'drop')
? initialConfig.newRequestBehavior
: 'queue';
const PORT = initialConfig.port || parseInt(process.env.PORT || '3003', 10);
let currentRequestTimeoutMs = initialConfig.requestTimeoutMs || parseInt(process.env.REQUEST_TIMEOUT_MS || '120000', 10);
// Create Express app
const app = express();
app.use(cors());
app.use(bodyParser.json({ limit: '10mb' })); // Increased payload size limit
// Admin UI: Serve static files
// Correct path considering TypeScript's outDir. __dirname will be 'dist' at runtime.
const adminUIDirectory = path.join(__dirname, '../src/admin-ui');
app.use('/admin-static', express.static(adminUIDirectory));
// Admin UI: Route for the main admin page
app.get('/admin', (req: Request, res: Response) => {
res.sendFile(path.join(adminUIDirectory, 'admin.html'));
});
// Create HTTP server
const server = http.createServer(app);
// Create WebSocket server for browser extension communication
const wss = new WebSocketServer({ server });
// Handle WebSocket connections from browser extensions
wss.on('connection', (ws: WebSocket) => {
console.log('Browser extension connected');
activeConnections.push(ws);
// Handle messages from browser extension
ws.on('message', (message: string) => {
try {
const data: WebSocketMessage = JSON.parse(message.toString());
let requestIdToProcess: number | undefined = undefined;
let responseDataToUse: string | undefined = undefined;
let isErrorMessage = false;
console.log(`SERVER: WebSocket message received from extension: type=${data.type}, requestId=${data.requestId}`);
if (data.type === 'CHAT_RESPONSE') {
requestIdToProcess = data.requestId;
responseDataToUse = data.response;
console.log(`SERVER: Processing CHAT_RESPONSE for requestId: ${data.requestId}`);
} else if (data.type === 'CHAT_RESPONSE_CHUNK' && data.isFinal === true) {
requestIdToProcess = data.requestId;
responseDataToUse = data.chunk;
console.log(`SERVER: Processing final CHAT_RESPONSE_CHUNK for requestId: ${data.requestId}`);
} else if (data.type === 'CHAT_RESPONSE_ERROR') {
requestIdToProcess = data.requestId;
responseDataToUse = data.error || "Unknown error from extension";
isErrorMessage = true;
console.log(`SERVER: Processing CHAT_RESPONSE_ERROR for requestId: ${data.requestId}`);
} else if (data.type === 'CHAT_RESPONSE_STREAM_ENDED') {
// This message type currently doesn't carry the final data itself in background.js,
// the CHAT_RESPONSE_CHUNK with isFinal=true does.
// So, we just log it. The promise should be resolved by the final CHUNK.
console.log(`SERVER: Received CHAT_RESPONSE_STREAM_ENDED for requestId: ${data.requestId}. No action taken as final data comes in CHUNK.`);
return;
} else {
console.log(`SERVER: Received unhandled WebSocket message type: ${data.type} for requestId: ${data.requestId}`);
return;
}
if (requestIdToProcess !== undefined) {
const pendingRequest = pendingRequests.get(requestIdToProcess);
if (pendingRequest) {
if (isErrorMessage) {
console.error(`SERVER: Rejecting request ${requestIdToProcess} with error: ${responseDataToUse}`);
pendingRequest.reject(new Error(responseDataToUse || "Error from extension"));
} else {
console.log(`SERVER: Resolving request ${requestIdToProcess} with data (first 100 chars): ${(responseDataToUse || "").substring(0,100)}`);
pendingRequest.resolve(responseDataToUse);
}
pendingRequests.delete(requestIdToProcess);
console.log(`SERVER: Request ${requestIdToProcess} ${isErrorMessage ? 'rejected' : 'resolved'} and removed from pending.`);
} else {
console.warn(`SERVER: Received response for unknown or timed-out requestId: ${requestIdToProcess}. No pending request found.`);
}
} else {
// This case should ideally not be reached if the above 'if/else if' for types is exhaustive for messages carrying a requestId.
console.warn(`SERVER: Received WebSocket message but could not determine requestId to process. Type: ${data.type}, Full Data:`, data);
}
} catch (error) {
console.error('SERVER: Error processing WebSocket message:', error, 'Raw message:', message.toString());
}
});
// Handle disconnection
ws.on('close', () => {
console.log('Browser extension disconnected');
activeConnections = activeConnections.filter(conn => conn !== ws);
});
});
// Function to log admin messages to in-memory store
async function logAdminMessage(
type: AdminLogEntry['type'], // Use the more specific type from AdminLogEntry
requestId: string | number,
data: AdminLogDataType // Use the specific union type for data
): Promise<void> {
const timestamp = new Date().toISOString();
// For debugging, let's log what's being passed to logAdminMessage
// console.log(`LOGGING [${type}] ReqID [${requestId}]:`, JSON.stringify(data, null, 2));
const logEntry: AdminLogEntry = {
timestamp,
type,
requestId: String(requestId),
data,
};
adminMessageHistory.unshift(logEntry);
if (adminMessageHistory.length > MAX_ADMIN_HISTORY_LENGTH) {
adminMessageHistory = adminMessageHistory.slice(0, MAX_ADMIN_HISTORY_LENGTH);
}
}
// Define processRequest
async function processRequest(queuedItem: QueuedRequest): Promise<void> {
const { requestId, req, res, userMessage, model, temperature, max_tokens } = queuedItem;
activeExtensionProcessingId = requestId;
logAdminMessage('CHAT_REQUEST_PROCESSING', requestId, { status: 'Sending to extension', activeExtensionProcessingId })
.catch(err => console.error("ADMIN_LOG_ERROR (CHAT_REQUEST_PROCESSING):", err));
console.log(`SERVER: Processing request ${requestId}. ActiveExtensionProcessingId: ${activeExtensionProcessingId}`);
try {
if (activeConnections.length === 0) {
console.error(`SERVER: No active extension connection for request ${requestId} during processing.`);
logAdminMessage('CHAT_REQUEST_ERROR', requestId, {
reason: "No extension connected during processing attempt",
activeExtensionProcessingId
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_REQUEST_ERROR):", err));
if (!res.headersSent) {
res.status(503).json({
error: {
message: "No browser extension connected when attempting to process request.",
type: "server_error",
code: "no_extension_during_processing"
}
});
}
return; // Exit early, finally block will call finishProcessingRequest
}
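const responsePromise = new Promise<string>((resolve, reject) => {
const timeoutId = setTimeout(() => {
if (pendingRequests.has(requestId)) {
pendingRequests.delete(requestId);
console.log(`SERVER: Request ${requestId} timed out after ${currentRequestTimeoutMs}ms during active processing. Rejecting promise.`);
reject(new Error('Request timed out during active processing'));
}
}, currentRequestTimeoutMs);
// Clear the timer as soon as the request settles so it does not linger for the full timeout
pendingRequests.set(requestId, {
resolve: (value: string) => { clearTimeout(timeoutId); resolve(value); },
reject: (reason?: any) => { clearTimeout(timeoutId); reject(reason); }
});
});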
const extension = activeConnections[0];
const messageToExtension: WebSocketMessage = {
type: 'SEND_CHAT_MESSAGE',
requestId,
message: userMessage,
settings: { model, temperature, max_tokens }
};
extension.send(JSON.stringify(messageToExtension));
console.log(`SERVER: Request ${requestId} sent to browser extension via processRequest.`);
const responseData = await responsePromise;
const formattedResponse = {
id: `chatcmpl-${Date.now()}`,
object: "chat.completion",
created: Math.floor(Date.now() / 1000),
model: model || "relay-model",
choices: [{ index: 0, message: { role: "assistant", content: responseData }, finish_reason: "stop" }],
usage: { prompt_tokens: -1, completion_tokens: -1, total_tokens: -1 }
};
logAdminMessage('CHAT_RESPONSE_SENT', requestId, {
fromExtension: responseData,
toClient: formattedResponse,
status: "Success (processed)"
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_RESPONSE_SENT):", err));
console.log(`SERVER: Request ${requestId} - Sending formatted response to client from processRequest.`);
if (!res.headersSent) {
res.json(formattedResponse);
}
} catch (error: any) {
console.error(`SERVER: Error in processRequest for ${requestId}:`, error);
const errorResponseJson = {
message: error.message || "Unknown error during request processing.",
type: "server_error",
code: error.message === 'Request timed out during active processing' ? "timeout_during_processing" : "processing_error",
requestId
};
logAdminMessage('CHAT_ERROR_RESPONSE_SENT', requestId, {
toClientError: errorResponseJson,
status: `Error in processRequest: ${error.message}`
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_ERROR_RESPONSE_SENT):", err));
if (!res.headersSent) {
res.status(500).json({ error: errorResponseJson });
}
} finally {
finishProcessingRequest(requestId);
}
}
function finishProcessingRequest(completedRequestId: number): void {
activeExtensionProcessingId = null;
pendingRequests.delete(completedRequestId);
console.log(`SERVER: Processing finished for requestId: ${completedRequestId}. Extension is now free.`);
if (newRequestBehavior === 'queue' && requestQueue.length > 0) {
const nextRequest = requestQueue.shift();
if (nextRequest) {
console.log(`SERVER: Dequeuing request ${nextRequest.requestId}. Queue length: ${requestQueue.length}`);
logAdminMessage('CHAT_REQUEST_DEQUEUED', nextRequest.requestId, {
queueLength: requestQueue.length,
dequeuedRequestId: nextRequest.requestId
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_REQUEST_DEQUEUED):", err));
processRequest(nextRequest).catch((error: Error) => {
console.error(`SERVER: Error processing dequeued request ${nextRequest.requestId}:`, error);
if (!nextRequest.res.headersSent) {
nextRequest.res.status(500).json({
error: {
message: `Failed to process dequeued request: ${error.message || 'Unknown error'}`,
type: "server_error",
code: "dequeued_request_processing_failed",
requestId: nextRequest.requestId
}
});
}
});
}
}
}
// Create API router
const apiRouter: Router = express.Router();
// OpenAI-compatible chat completions endpoint
apiRouter.post('/chat/completions', async (req: Request, res: Response): Promise<void> => {
const requestId = requestCounter++;
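const { messages, model, temperature, max_tokens } = req.body;
// Reject malformed bodies up front; an exception thrown here would escape this async handler
// and leave the client without a response.
if (!Array.isArray(messages) || messages.length === 0) {
res.status(400).json({
error: {
message: "Invalid request: 'messages' must be a non-empty array.",
type: "invalid_request_error",
code: "invalid_messages"
}
});
return;
}
const userMessage = messages[messages.length - 1].content;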
// Log initial receipt and intended action
let initialActionLog = 'DirectProcessing';
if (activeExtensionProcessingId !== null) {
initialActionLog = newRequestBehavior === 'queue' ? 'AttemptQueue' : 'AttemptDrop';
}
logAdminMessage('CHAT_REQUEST_RECEIVED', requestId, {
fromClient: userMessage,
modelSettings: { model, temperature, max_tokens },
initialAction: initialActionLog, // Use the determined log value
currentActiveExtensionProcessingId: activeExtensionProcessingId,
newRequestBehaviorSetting: newRequestBehavior
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_REQUEST_RECEIVED):", err));
console.log(`SERVER: Request ${requestId} received. Initial Action: ${initialActionLog}. Active ID: ${activeExtensionProcessingId}, Behavior: ${newRequestBehavior}`);
if (activeConnections.length === 0) {
logAdminMessage('CHAT_REQUEST_ERROR', requestId, {
reason: "No extension connected at time of request",
clientMessage: userMessage,
details: "Response 503 sent to client."
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_REQUEST_ERROR):", err));
console.log(`SERVER: Request ${requestId} - No browser extension connected. Responding 503.`);
if (!res.headersSent) {
res.status(503).json({
error: {
message: "No browser extension connected. Please open the chat interface and ensure the extension is active.",
type: "server_error",
code: "no_extension_connected"
}
});
}
return;
}
const queuedItem: QueuedRequest = {
requestId,
req,
res,
userMessage,
model,
temperature,
max_tokens
};
if (activeExtensionProcessingId !== null) { // Extension is busy
if (newRequestBehavior === 'drop') {
logAdminMessage('CHAT_REQUEST_DROPPED', requestId, {
reason: "Extension busy",
droppedForRequestId: activeExtensionProcessingId,
clientMessage: userMessage,
details: "Response 429 sent to client."
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_REQUEST_DROPPED):", err));
console.log(`SERVER: Request ${requestId} dropped as extension is busy with ${activeExtensionProcessingId}.`);
if (!res.headersSent) {
res.status(429).json({
error: {
message: "Too Many Requests: Extension is currently busy. Please try again later.",
type: "client_error",
code: "extension_busy_request_dropped"
}
});
}
return;
}
if (newRequestBehavior === 'queue') {
requestQueue.push(queuedItem);
logAdminMessage('CHAT_REQUEST_QUEUED', requestId, {
queuePosition: requestQueue.length,
queuedForRequestId: activeExtensionProcessingId,
clientMessage: userMessage,
queueLength: requestQueue.length
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_REQUEST_QUEUED):", err));
console.log(`SERVER: Request ${requestId} queued. Position: ${requestQueue.length}. Extension busy with: ${activeExtensionProcessingId}`);
// Do NOT send a response yet, the 'res' object is stored in the queue.
return;
}
}
// If extension is free (activeExtensionProcessingId is null)
// processRequest will handle its own errors and responses including calling res.json() or res.status().json()
processRequest(queuedItem).catch(error => {
// This catch is a safety net if processRequest itself throws an unhandled error *before* it can send a response.
console.error(`SERVER: Unhandled error from processRequest for ${requestId} in /chat/completions:`, error);
logAdminMessage('CHAT_ERROR_RESPONSE_SENT', requestId, {
toClientError: { message: (error as Error).message, type: "server_error", code: "unhandled_processing_catch" },
status: `Error: ${(error as Error).message}`
}).catch(err => console.error("ADMIN_LOG_ERROR (CHAT_ERROR_RESPONSE_SENT):", err));
if (!res.headersSent) {
res.status(500).json({
error: {
message: `Internal server error during request processing: ${(error as Error).message || 'Unknown error'}`,
type: "server_error",
code: "unhandled_processing_error_in_handler",
requestId: requestId
}
});
}
});
});
// Models endpoint
apiRouter.get('/models', (req: Request, res: Response, next: NextFunction) => {
try {
res.json({
object: "list",
data: [
{
id: "gemini-pro",
object: "model",
created: 1677610602,
owned_by: "relay"
},
{
id: "claude-3",
object: "model",
created: 1677610602,
owned_by: "relay"
}
]
});
} catch (error) {
next(error);
}
});
// Endpoint to retrieve message history for Admin UI
apiRouter.get('/admin/message-history', (req: Request, res: Response): void => {
try {
// Return the latest 100 entries (or fewer if less than 100 exist)
const historyToReturn = adminMessageHistory.slice(0, 100);
res.json(historyToReturn); // Objects are already parsed
} catch (error) {
console.error('Error fetching message history from in-memory store:', error);
if (!res.headersSent) {
res.status(500).json({
error: {
message: (error instanceof Error ? error.message : String(error)) || 'Failed to retrieve message history',
type: 'server_error',
code: 'history_retrieval_failed'
}
});
}
}
});
// Endpoint to provide server configuration and status
apiRouter.get('/admin/server-info', (req: Request, res: Response): void => {
try {
const uptimeSeconds = Math.floor((Date.now() - serverStartTime) / 1000);
const serverInfo = {
port: PORT,
requestTimeoutMs: currentRequestTimeoutMs, // Report the current mutable value
newRequestBehavior: newRequestBehavior, // Add the current behavior
pingIntervalMs: null, // Placeholder - No explicit ping interval defined for client pings
connectedExtensionsCount: activeConnections.length,
uptimeSeconds: uptimeSeconds,
};
res.json(serverInfo);
} catch (error) {
console.error('Error fetching server info:', error);
if (!res.headersSent) {
res.status(500).json({
error: {
message: (error instanceof Error ? error.message : String(error)) || 'Failed to retrieve server info',
type: 'server_error',
code: 'server_info_failed'
}
});
}
}
});
// Endpoint to restart the server
apiRouter.post('/admin/restart-server', (req: Request, res: Response): void => {
  console.log('ADMIN: Received request to restart server.');
  // Removed premature res.json() call that was here.
  // Log absolute paths for debugging
  const absoluteConfigPath = path.resolve(CONFIG_FILE_PATH);
  const absoluteTriggerPath = path.resolve(RESTART_TRIGGER_FILE_PATH);
  console.log(`ADMIN: Config file path (absolute): ${absoluteConfigPath}`);
  console.log(`ADMIN: Trigger file path (absolute): ${absoluteTriggerPath}`);
  try {
    // 1. Update and save server-config.json
    const configToSave = loadServerConfig();
    configToSave.lastRestartRequestTimestamp = Date.now();
    saveServerConfig(configToSave); // This function already has its own try/catch and logs
    console.log('ADMIN: Attempted to update server-config.json.');
    // 2. Explicitly touch/create the .triggerrestart file.
    try {
      fs.writeFileSync(RESTART_TRIGGER_FILE_PATH, Date.now().toString(), 'utf-8');
      console.log(`ADMIN: Successfully wrote to restart trigger file: ${absoluteTriggerPath}`);
    } catch (triggerFileError) {
      console.error(`ADMIN: FAILED to write restart trigger file at ${absoluteTriggerPath}:`, triggerFileError);
    }
  } catch (outerError) {
    // This catch is for errors in loadServerConfig or if saveServerConfig itself throws unexpectedly
    console.error('ADMIN: Error in outer try block during restart sequence (e.g., loading config):', outerError);
  }
  // 3. Send response to client
  res.status(200).json({ message: 'Server restart initiated. Nodemon should pick up file changes.' });
  // 4. Exit the process after a longer delay.
  setTimeout(() => {
    console.log('ADMIN: Exiting process for nodemon to restart.');
    process.exit(0);
  }, 1500); // Increased delay to 1.5 seconds
});
// Update-settings endpoint: handles requestTimeoutMs, port, and newRequestBehavior.
apiRouter.post('/admin/update-settings', (req: Request, res: Response): void => {
  const { requestTimeoutMs, port, newRequestBehavior: newBehaviorValue } = req.body;
  let configChanged = false;
  let messages: string[] = [];
  const currentConfig = loadServerConfig(); // Load current disk config to preserve other settings
  if (requestTimeoutMs !== undefined) {
    const newTimeout = parseInt(String(requestTimeoutMs), 10);
    if (!isNaN(newTimeout) && newTimeout > 0) {
      currentRequestTimeoutMs = newTimeout; // Update in-memory value immediately
      currentConfig.requestTimeoutMs = newTimeout; // Update config for saving
      configChanged = true;
      messages.push(`Request timeout updated in memory to ${currentRequestTimeoutMs}ms. This change is effective immediately.`);
      logAdminMessage('SETTING_UPDATE', 'SERVER_CONFIG', { setting: 'requestTimeoutMs', value: currentRequestTimeoutMs })
        .catch(err => console.error("ADMIN_LOG_ERROR (SETTING_UPDATE):", err));
    } else {
      res.status(400).json({ error: 'Invalid requestTimeoutMs value. Must be a positive number.' });
      return;
    }
  }
  if (port !== undefined) {
    const newPort = parseInt(String(port), 10);
    if (!isNaN(newPort) && newPort > 0 && newPort <= 65535) {
      currentConfig.port = newPort; // Update config for saving
      configChanged = true;
      messages.push(`Server port configured to ${newPort}. This change will take effect after server restart.`);
      logAdminMessage('SETTING_UPDATE', 'SERVER_CONFIG', { setting: 'port', value: newPort, requiresRestart: true })
        .catch(err => console.error("ADMIN_LOG_ERROR (SETTING_UPDATE):", err));
    } else {
      res.status(400).json({ error: 'Invalid port value. Must be a positive number between 1 and 65535.' });
      return;
    }
  }
  if (newBehaviorValue !== undefined) {
    if (newBehaviorValue === 'queue' || newBehaviorValue === 'drop') {
      newRequestBehavior = newBehaviorValue; // Update in-memory value immediately
      currentConfig.newRequestBehavior = newBehaviorValue; // Update config for saving
      configChanged = true;
      messages.push(`New request behavior updated to '${newRequestBehavior}'. This change is effective immediately.`);
      logAdminMessage('SETTING_UPDATE', 'SERVER_CONFIG', { setting: 'newRequestBehavior', value: newRequestBehavior })
        .catch(err => console.error("ADMIN_LOG_ERROR (SETTING_UPDATE newRequestBehavior):", err));
    } else {
      res.status(400).json({ error: "Invalid newRequestBehavior value. Must be 'queue' or 'drop'." });
      return;
    }
  }
  if (configChanged) {
    saveServerConfig(currentConfig);
    res.json({ message: messages.join(' ') });
  } else {
    res.status(400).json({ error: 'No valid settings provided or no changes made.' });
  }
});
// Health check endpoint
app.get('/health', (req: Request, res: Response) => {
  res.status(200).json({ status: 'OK', activeBrowserConnections: activeConnections.length });
});
// Mount the API router
app.use('/v1', apiRouter);
// Start the server
server.listen(PORT, () => {
  console.log(`OpenAI-compatible relay server running on port ${PORT}`);
  console.log(`WebSocket server for browser extensions running on ws://localhost:${PORT}`);
});


@ -0,0 +1,19 @@
{
  "compilerOptions": {
    "target": "ES2020",
    "module": "CommonJS",
    "moduleResolution": "node",
    "outDir": "./dist",
    "rootDir": "./src",
    "strict": true,
    "esModuleInterop": true,
    "skipLibCheck": true,
    "forceConsistentCasingInFileNames": true
  },
  "include": [
    "src/**/*"
  ],
  "exclude": [
    "node_modules"
  ]
}

99
docs/background-script.md Normal file

@ -0,0 +1,99 @@
# Background Script Architecture (`background.js`)
This document explains the structure and behavior of the [`background.js`](extension/background.js:1) script, which acts as the coordination layer between the browser, content scripts, and the external WebSocket relay server.
---
## 🧩 Overview
The background script performs five key functions:
1. Connects to a central WebSocket relay server.
2. Handles incoming `SEND_CHAT_MESSAGE` commands.
3. Determines the best tab to route each command to.
4. Forwards commands to `content.js`.
5. Relays responses and errors back to the server.
---
## ⚙️ WebSocket Lifecycle
### Startup
- Load settings via [`loadSettingsAndConnect()`](extension/background.js:28)
- Check health endpoint: [`/health`](extension/background.js:51)
- If OK, start [`attemptWebSocketConnection()`](extension/background.js:76)
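The startup order above (health check first, WebSocket second) could be sketched roughly as follows. This is a minimal illustration and not the actual `background.js` code: `checkHealthThenConnect` and its injected `fetchFn` and `connectFn` parameters are hypothetical names. The only detail taken from the server is that `/health` answers with a JSON body containing `status: 'OK'`.

```javascript
// Hypothetical sketch: probe the relay's /health endpoint before opening a WebSocket.
// `fetchFn` and `connectFn` are injected so the logic can be exercised without a browser.
async function checkHealthThenConnect(baseUrl, fetchFn, connectFn) {
  try {
    const response = await fetchFn(`${baseUrl}/health`);
    if (!response.ok) {
      return { connected: false, reason: `health check returned HTTP ${response.status}` };
    }
    const body = await response.json();
    if (body.status !== "OK") {
      return { connected: false, reason: "relay reported an unhealthy status" };
    }
    // Only attempt the WebSocket connection once the relay looks healthy.
    connectFn(baseUrl.replace(/^http/, "ws"));
    return { connected: true, reason: null };
  } catch (error) {
    // Network failures (relay not running, wrong port) end up here.
    return { connected: false, reason: String(error) };
  }
}
```

In an extension, `fetchFn` would typically be the global `fetch` and `connectFn` a function that constructs the `WebSocket`.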
### Connection Events
- `onopen`: Connection established
- `onmessage`: Handle relay command (e.g., [`SEND_CHAT_MESSAGE`](extension/background.js:92))
- `onclose`: Clean up, schedule reconnect
- `onerror`: Logged but not retried directly
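One way to wire these four handlers is sketched below, assuming a capped exponential backoff for reconnects. The helper names, the 1 s base delay, and the 30 s cap are illustrative assumptions; the document does not state the real reconnect timing. In practice `scheduleFn` would be `setTimeout`.

```javascript
// Hypothetical helper: compute the delay before the next reconnect attempt.
// The base delay, the cap, and the doubling policy are illustrative assumptions.
function nextReconnectDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  const delay = baseMs * 2 ** attempt;
  return Math.min(delay, maxMs);
}

// Sketch of wiring the handlers: only `onclose` schedules a reconnect,
// while `onerror` just logs, mirroring the list above.
function attachLifecycleHandlers(socket, state, scheduleFn, reconnectFn) {
  socket.onopen = () => {
    state.attempt = 0; // a successful connection resets the backoff
  };
  socket.onerror = (event) => {
    console.error("Relay socket error:", event);
  };
  socket.onclose = () => {
    const delay = nextReconnectDelayMs(state.attempt);
    state.attempt += 1;
    scheduleFn(reconnectFn, delay);
  };
}
```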
---
## 🔄 Message Flow
```mermaid
sequenceDiagram
participant Server
participant Background
participant ContentScript
Server->>Background: SEND_CHAT_MESSAGE (requestId)
Background->>ContentScript: Forward command with requestId
ContentScript-->>Background: ACK or Error
Background-->>Server: CHAT_RESPONSE or CHAT_RESPONSE_ERROR
```
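The round trip in the diagram can be sketched as a single handler. This is an assumption-laden illustration: `handleRelayMessage` and `forwardToContent` are hypothetical names, and the `response` and `error` field names in the reply objects are guesses rather than the documented wire format. Only the message types and `requestId` come from the diagram.

```javascript
// Hypothetical sketch of the relay round trip shown in the diagram.
// `forwardToContent` stands in for whatever delivers the command to the content script;
// it is assumed to return a promise that resolves with the response or rejects on failure.
async function handleRelayMessage(rawMessage, forwardToContent) {
  const command = JSON.parse(rawMessage);
  if (command.type !== "SEND_CHAT_MESSAGE") {
    return null; // other command types are out of scope for this sketch
  }
  try {
    const response = await forwardToContent(command);
    return { type: "CHAT_RESPONSE", requestId: command.requestId, response };
  } catch (error) {
    return {
      type: "CHAT_RESPONSE_ERROR",
      requestId: command.requestId,
      error: error instanceof Error ? error.message : String(error),
    };
  }
}
```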
---
## 📌 State
- [`relaySocket`](extension/background.js:10): Active WebSocket
- [`activeTabId`](extension/background.js:13): Last successful tab
- [`lastRequestId`](extension/background.js:15): For current message
- [`processingRequest`](extension/background.js:16): Concurrency flag
- [`pendingRequests`](extension/background.js:17): Queue for future
---
## 🔧 Command Handling
- Incoming message: [`onmessage`](extension/background.js:88)
- If type is `SEND_CHAT_MESSAGE`, it clears [`pendingRequests`](extension/background.js:101), updates globals, and forwards to content
- [`forwardCommandToContentScript()`](extension/background.js:145) pings known tab or scans for valid tab (`chatgpt.com`, `aistudio`, etc.)
- Sets `lastKnownRequestId` in [`debuggerAttachedTabs`](extension/background.js:24)
---
## 🛡️ Error Handling
- If no tab found or failure: sends [`CHAT_RESPONSE_ERROR`](extension/background.js:203)
- If `chrome.runtime.lastError`: fallback and log
- All retry logic is bounded and observable
---
## 🧠 Debugger Coordination
Maintains a `debuggerAttachedTabs` map keyed by tabId, storing:
```js
{
  providerName,
  patterns,
  isFetchEnabled,
  isAttached,
  lastKnownRequestId
}
```
Used to route debugger data back to correct content script or relay logic.
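A minimal sketch of how such a map might be maintained and queried is shown here. It assumes the map is a JavaScript `Map`; the helper functions `trackTab`, `setLastKnownRequestId`, and `requestIdForTab` are hypothetical and only illustrate routing by `lastKnownRequestId`.

```javascript
// Hypothetical sketch: a Map keyed by tabId holding per-tab debugger state.
const debuggerAttachedTabs = new Map();

function trackTab(tabId, providerName, patterns) {
  debuggerAttachedTabs.set(tabId, {
    providerName,
    patterns,
    isFetchEnabled: false,
    isAttached: true,
    lastKnownRequestId: null,
  });
}

// Record which relay request a tab is currently serving.
function setLastKnownRequestId(tabId, requestId) {
  const tabInfo = debuggerAttachedTabs.get(tabId);
  if (!tabInfo) return false;
  tabInfo.lastKnownRequestId = requestId;
  return true;
}

// Resolve the request that captured debugger data belongs to, or null if unknown.
function requestIdForTab(tabId) {
  const tabInfo = debuggerAttachedTabs.get(tabId);
  return tabInfo && tabInfo.isAttached ? tabInfo.lastKnownRequestId : null;
}
```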
---
## ✅ Summary
[`background.js`](extension/background.js:1) is the broker between remote commands and browser tabs. It ensures reliability via health checks, dynamic tab discovery, request tracking, and robust error handling.

17
docs/chatrelay-mode.json Normal file

File diff suppressed because one or more lines are too long


@ -0,0 +1,242 @@
# Consolidated Provider Documentation
This document provides a comprehensive overview of the provider architecture used within the extension, including the utility script for managing providers and detailed descriptions of individual providers like AI Studio and ChatGPT.
---
## 1. Provider Utils (`provider-utils.js`)
The [`provider-utils.js`](extension/providers/provider-utils.js:1) script is crucial for managing provider registration and enabling dynamic lookup based on the current website domain.
### 🧩 Overview
This module is injected into the global `window` object as `window.providerUtils` and offers two core functions:
- [`registerProvider()`](extension/providers/provider-utils.js:7): Registers a provider instance with one or more domains.
- [`detectProvider()`](extension/providers/provider-utils.js:19): Looks up a provider instance based on the current hostname.
### 🌍 Provider Registry
Internally, the registry is held in:
```js
const providerMap = {}; // domain -> { name, instance }
```
Providers are registered like:
```js
registerProvider("AIStudioProvider", ["aistudio.google.com"], new AIStudioProvider());
```
This allows a single provider instance to be registered under multiple domains if necessary.
### 🔍 Provider Detection
The function [`detectProvider(hostname)`](extension/providers/provider-utils.js:19) performs a partial match against the registered `domainKey`s to determine the best match.
If no match is found, it returns `null` and logs the result.
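The registry and lookup described above could look roughly like this. It is a re-creation for illustration, not the actual `provider-utils.js` source: the validation details, the log messages, and the choice to return the first matching instance (rather than some ranked "best" match) are assumptions.

```javascript
// Hypothetical re-creation of the registry described above.
const providerMap = {}; // domain -> { name, instance }

function registerProvider(name, domains, instance) {
  if (typeof name !== "string" || !Array.isArray(domains) || !instance) {
    console.error("registerProvider: invalid arguments");
    return;
  }
  for (const domain of domains) {
    providerMap[domain] = { name, instance };
  }
}

function detectProvider(hostname) {
  if (typeof hostname !== "string" || hostname.length === 0) {
    console.warn("detectProvider: missing or malformed hostname");
    return null;
  }
  // Partial match: the hostname only needs to contain a registered domain key.
  for (const domainKey of Object.keys(providerMap)) {
    if (hostname.includes(domainKey)) {
      return providerMap[domainKey].instance;
    }
  }
  return null;
}
```

With a substring match, a registration for `chatgpt.com` would also match a hostname such as `www.chatgpt.com`, which is one plausible reading of "partial match".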
### 🔐 Error Handling
- Validates types of all registration arguments.
- Logs malformed or missing hostnames during detection.
- Silently fails for misconfiguration, aiding fault tolerance.
### 🧪 Debug Logging
- Logs the entire provider map for visibility on each detection.
- Confirms successful matches and domain checks.
### ✅ Summary
[`provider-utils.js`](extension/providers/provider-utils.js:1) provides a lightweight and dynamic mechanism for associating hostnames with provider implementations. It ensures extensibility for future integrations while remaining simple and debuggable.
---
## 2. AI Studio Provider (`AIStudioProvider`)
The [`AIStudioProvider`](extension/providers/aistudio.js:3) class is used to interact with Google AI Studio's web interface.
### 🧩 Overview
[`AIStudioProvider`](extension/providers/aistudio.js:3) is a browser-automated provider class for sending messages and capturing responses from `aistudio.google.com`. It offers support for DOM or Chrome Debugger-based response parsing and optionally enables function-calling features.
### ⚙️ Configurable Options
```js
this.captureMethod = "debugger"; // or "dom"
this.debuggerUrlPattern = "*MakerSuiteService/GenerateContent*";
this.includeThinkingInMessage = false;
this.ENABLE_AISTUDIO_FUNCTION_CALLING = true;
```
These parameters control how responses are captured and whether additional intermediate output like “thinking” is included in results.
### 📌 DOM Selectors
- Input field: [`this.inputSelector`](extension/providers/aistudio.js:24)
- Send button: [`this.sendButtonSelector`](extension/providers/aistudio.js:27)
- Main response blocks: [`this.responseSelector`](extension/providers/aistudio.js:30)
- Typing indicators: [`this.thinkingIndicatorSelector`](extension/providers/aistudio.js:33)
Fallback selectors are used when DOM capture is selected and standard elements fail.
### 🔄 Lifecycle
#### 1. Initialization
- Assigns selectors and default behavior
- Enables function calling via [`ensureFunctionCallingEnabled()`](extension/providers/aistudio.js:72)
- Binds to `window.navigation` for SPA-aware page detection
#### 2. Message Sending
- [`sendChatMessage(text)`](extension/providers/aistudio.js:131): Finds input field and button, inserts text, and triggers a click with retry and verification logic.
#### 3. Response Capture
##### Method: Debugger
- Registers callback: [`initiateResponseCapture()`](extension/providers/aistudio.js:209)
- Handles debugger message: [`handleDebuggerData()`](extension/providers/aistudio.js:226)
- Parses chunks via [`parseDebuggerResponse()`](extension/providers/aistudio.js:439)
##### Method: DOM
- Starts mutation observer loop: [`_startDOMMonitoring()`](extension/providers/aistudio.js:598)
- Identifies end of generation: [`_isResponseStillGeneratingDOM()`](extension/providers/aistudio.js:577)
### 🛡️ Error & Edge Case Handling
- Detects failed button presses
- Gracefully handles unknown capture methods
- Times out function-calling polling after 7s with fallback logging
### 🔧 Function Calling Enable Logic
```mermaid
sequenceDiagram
participant Provider
participant DOM
Provider->>DOM: Query 'button[aria-label="Function calling"]'
DOM-->>Provider: aria-checked="false"
Provider->>DOM: click()
DOM-->>Provider: recheck aria-checked
```
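The query, click, and re-check loop in the diagram, combined with the 7-second timeout mentioned under error handling, might be sketched as below. `ensureToggleEnabled` and the injected `queryButton` function are hypothetical, and the 250 ms polling interval is an assumption; only the selector, the `aria-checked` attribute, and the 7 s limit come from the text.

```javascript
// Hypothetical sketch of the toggle-and-recheck loop. `queryButton` is injected and is assumed
// to return the toggle element (or null); in a page it could wrap
// document.querySelector('button[aria-label="Function calling"]').
async function ensureToggleEnabled(queryButton, { timeoutMs = 7000, intervalMs = 250 } = {}) {
  const deadline = Date.now() + timeoutMs;
  let clicked = false;
  while (Date.now() < deadline) {
    const button = queryButton();
    if (button) {
      if (button.getAttribute("aria-checked") === "true") {
        return true; // already enabled, or the earlier click took effect
      }
      if (!clicked) {
        button.click(); // click once, then only re-check on later passes
        clicked = true;
      }
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  console.warn("Function calling toggle not confirmed before timeout");
  return false;
}
```

Clicking only once avoids toggling the control back off if the attribute update lags behind the click.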
### ✅ Summary
This provider enables integration with AI Studio via browser automation. Its flexibility in capture methods and dynamic DOM monitoring makes it robust to a range of layout changes or interface evolutions.
---
## 3. ChatGPT Provider (`ChatGptProvider`)
The [`ChatGptProvider`](extension/providers/chatgpt.js:3) class automates and captures interactions with `chatgpt.com`.
### 🧩 Overview
[`ChatGptProvider`](extension/providers/chatgpt.js:3) enables message injection, UI automation, and response capture from ChatGPT via DOM or Chrome Debugger methods. It uses retry logic for robust message delivery and streaming capture for response chunks.
### ⚙️ Configurable Parameters
```js
this.captureMethod = "debugger"; // or "dom"
this.debuggerUrlPattern = "*chatgpt.com/backend-api/conversation*";
this.includeThinkingInMessage = true;
```
These control the response source (network or UI) and message formatting behavior.
### 📌 DOM Elements
- Input: [`#prompt-textarea`](extension/providers/chatgpt.js:12)
- Send button: [`#composer-submit-button`](extension/providers/chatgpt.js:13)
- Response area: [`.message-bubble .text-content`](extension/providers/chatgpt.js:14)
- Loading spinner: [`.loading-spinner`](extension/providers/chatgpt.js:15)
- Fallback DOM: [`.message-container .response-text`](extension/providers/chatgpt.js:16)
### 🔄 Lifecycle
#### 1. Initialization
- Sets up DOM selectors and state containers
- Initializes request tracking maps and debug logs
#### 2. Sending Messages
- [`sendChatMessage()`](extension/providers/chatgpt.js:25):
  - Finds input + button, injects message
  - Retries on failure (up to 5 times)
  - Waits between attempts and checks element readiness
#### 3. Capturing Responses
##### A. Debugger Mode
- Callback registration via [`initiateResponseCapture()`](extension/providers/chatgpt.js:105)
- Processes data in [`handleDebuggerData()`](extension/providers/chatgpt.js:121)
- Uses accumulator map for text chunk assembly and [`parseDebuggerResponse()`](extension/providers/chatgpt.js:199) to interpret SSE stream format
##### B. DOM Mode
- Starts DOM observer loop via [`_startDOMMonitoring()`](extension/providers/chatgpt.js:499)
- Stops when [`_isResponseStillGeneratingDOM()`](extension/providers/chatgpt.js:489) returns false
### 🛡️ Error Handling
- [`_reportSendError()`](extension/providers/chatgpt.js:93) reports issues back to callback
- Handles:
  - Missing input or button
  - Disabled controls
  - Empty raw debugger data
  - Unparseable or non-relevant JSON payloads
### 🧠 Streaming SSE Parse Logic
```mermaid
sequenceDiagram
participant Debugger
participant Provider
participant ContentScript
Debugger-->>Provider: data: { chunk }
Provider->>Provider: parseDebuggerResponse()
Provider-->>ContentScript: Accumulated text + isFinal
```
### ✅ Summary
The ChatGPT provider is designed for robust interaction with ChatGPT's UI or network. It supports retries, error recovery, and chunked response reconstruction, ensuring high reliability and compatibility across updates to the site's frontend.
---
## 4. Claude Provider (`ClaudeProvider`)
The [`ClaudeProvider`](./provider-claude.md) class is used to interact with Anthropic's Claude AI models via the `claude.ai` web interface.
### 🧩 Overview
[`ClaudeProvider`](./provider-claude.md) primarily uses the Chrome DevTools Debugger API to intercept and process Server-Sent Events (SSE) for streaming responses from `claude.ai`.
### ⚙️ Configurable Options
```js
this.captureMethod = "debugger";
this.debuggerUrlPattern = "*/completion*"; // Matches Claude's streaming endpoint
this.includeThinkingInMessage = false; // Default, focuses on final answer
this.ENABLE_CLAUDE_FUNCTION_CALLING = true; // Currently unused, logic commented out
```
These parameters control how responses are captured.
### 📌 DOM Selectors
- Input field: [`this.inputSelector`](extension/providers/claude.js:24) (`div.ProseMirror[contenteditable="true"]`)
- Send button: [`this.sendButtonSelector`](extension/providers/claude.js:27) (`button[aria-label="Send message"]`)
- Response area (for DOM fallback): [`this.responseSelector`](extension/providers/claude.js:30)
- Thinking indicator (for DOM fallback): [`this.thinkingIndicatorSelector`](extension/providers/claude.js:33)
### 🔄 Lifecycle
#### 1. Initialization
- Assigns selectors and default behavior.
- Initializes `pendingResponseCallbacks` and `requestBuffers` for managing streaming responses.
- (Function calling enablement logic is currently commented out).
#### 2. Message Sending
- [`sendChatMessage(messageContent)`](extension/providers/claude.js:135):
  - Handles string, Blob, or array (text/image_url) content.
  - Sets text content on the input field.
  - Dispatches a `ClipboardEvent('paste')` for image data.
  - Clicks the send button with retry logic if initially disabled.
#### 3. Response Capture (Debugger Method)
- Callback registration via [`initiateResponseCapture()`](extension/providers/claude.js:264).
- Handles debugger data in [`handleDebuggerData()`](extension/providers/claude.js:281):
  - Accumulates text from multiple SSE chunks in `requestBuffers`.
  - Uses [`parseDebuggerResponse()`](extension/providers/claude.js:336) to interpret Claude's SSE stream.
  - Calls the main response callback only with the complete message when `includeThinkingInMessage` is `false` and an end-of-message event (`message_stop` or `message_delta` with `stop_reason`) is detected, or when the background script signals the absolute end of the stream.
### 🧠 Streaming SSE Parse Logic (`parseDebuggerResponse`)
- Splits raw data into individual SSE messages.
- Extracts `event:` type and `data:` payload.
- Appends text from `content_block_delta` events.
- Sets `isFinalResponse: true` if `event: message_stop` is seen or if `event: message_delta` contains a `stop_reason`.
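The four parsing steps above can be sketched as a small function. This is an illustration rather than the real `parseDebuggerResponse()`: the function name `parseSseChunk` is hypothetical, and the payload field names (`delta.text`, `delta.stop_reason`) are assumptions about the stream format that the document does not spell out.

```javascript
// Hypothetical sketch of the parsing steps listed above.
function parseSseChunk(rawData) {
  let text = "";
  let isFinalResponse = false;

  // SSE messages are separated by a blank line.
  for (const block of rawData.split(/\r?\n\r?\n/)) {
    let eventType = null;
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        eventType = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice("data:".length).trim());
      }
    }
    if (!eventType) continue;
    if (eventType === "message_stop") {
      isFinalResponse = true; // end of message, regardless of payload
      continue;
    }

    let payload;
    try {
      payload = JSON.parse(dataLines.join("\n"));
    } catch {
      continue; // skip partial or non-JSON payloads
    }
    if (!payload || !payload.delta) continue;

    if (eventType === "content_block_delta" && typeof payload.delta.text === "string") {
      text += payload.delta.text;
    } else if (eventType === "message_delta" && payload.delta.stop_reason) {
      isFinalResponse = true;
    }
  }
  return { text, isFinalResponse };
}
```

A caller would append `text` to the per-request buffer and invoke the response callback once `isFinalResponse` is `true`.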
### ✅ Summary
The Claude provider enables integration with `claude.ai` by intercepting and parsing its SSE stream. Its logic is tailored to accumulate streamed message parts and deliver a complete response.
---
## 5. Provider Comparison
This section compares `extension/providers/chatgpt.js` and `extension/providers/aistudio.js`.
### 1. Configurable Properties
| Property | ChatGptProvider | AIStudioProvider |
| -------------------------- | ---------------------------------------- | ------------------------------------------------------- |
| `captureMethod` | "debugger" | "debugger" |
| `debuggerUrlPattern` | `*chatgpt.com/backend-api/conversation*` | `*MakerSuiteService/GenerateContent*` |
| `includeThinkingInMessage` | `true` | `false` |
| Function calling toggle | N/A | `ENABLE_AISTUDIO_FUNCTION_CALLING` (with polling logic) |
### 2. Provider Identity
* **Name**:
  * ChatGPT: `ChatGptProvider`
  * AI Studio: `AIStudioProvider`
* **Supported Domains**:
  * ChatGPT: `["chatgpt.com"]`
  * AI Studio: `["aistudio.google.com"]`
### 3. Selectors & UI Interactions
| Selector Type | ChatGPT (`chatgpt.js`) | AI Studio (`aistudio.js`) |
| ------------------------ | ------------------------------------ | ------------------------------------------------------------ |
| Input Field | `#prompt-textarea` | `textarea.textarea`, `textarea.gmat-body-medium`, etc. |
| Send Button | `button[data-testid="send-button"]` | `button.run-button`, `button[aria-label="Run"]`, etc. |
| Response Capture (DOM) | `.message-bubble .text-content` | `.response-container`, `.model-response`, `.cmark-node`, ... |
| Thinking Indicator (DOM) | `.loading-spinner`, `.thinking-dots` | `.thinking-indicator`, `.loading-indicator`, etc. |
### 4. Message Sending Logic
* **ChatGptProvider**:
  * Supports string, Blob, and array payloads.
  * Sets `innerText` on a contenteditable div and uses `ClipboardEvent('paste')` for images.
  * Retries clicking send up to 5 times with exponential backoff.
* **AIStudioProvider**:
  * Handles similar payload types, but sets text via `inputField.value` and a paste event.
  * Retries the send click for up to 60 attempts (5 minutes in total), triggering UI events to enable the button.
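The ChatGPT-style retry described above (a bounded number of attempts with exponential backoff) could be sketched as follows. `sendWithRetry`, `attemptSend`, and the 200 ms base delay are illustrative assumptions; only the limit of 5 attempts comes from the text.

```javascript
// Hypothetical sketch of a bounded retry loop with exponential backoff.
// `attemptSend` is assumed to resolve to true once the send click succeeds.
// `sleep` can be injected for testing; by default it waits with setTimeout.
async function sendWithRetry(attemptSend, { maxAttempts = 5, baseDelayMs = 200, sleep } = {}) {
  const wait = sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await attemptSend(attempt)) {
      return { ok: true, attempts: attempt };
    }
    if (attempt < maxAttempts) {
      await wait(baseDelayMs * 2 ** (attempt - 1)); // 200, 400, 800, ...
    }
  }
  return { ok: false, attempts: maxAttempts };
}
```

The AI Studio behaviour (60 attempts over 5 minutes) would correspond to roughly one attempt every 5 seconds if the attempts are evenly spaced, which suggests a fixed interval rather than a doubling delay.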
### 5. Response Capture Mechanisms
* **Debugger-Based Streaming**:
  * **ChatGPT**: Complex SSE parsing supporting thoughts, reasoning recaps, JSON patches, and OpenAI deltas.
  * **AI Studio**: Simplified JSON array parsing with `extractTextSegments` and `findEndOfUnitMarker`, plus `includeThinkingInMessage` toggle.
* **DOM Fallback**:
  * Both implement DOM monitoring, but selectors and timing differ:
    * ChatGPT polls every 500ms, with stability checks and cleanup on final.
    * AI Studio polls every 1s up to 15s, with fallback search through multiple DOM patterns.
### 6. Registration
Both providers register themselves via:
```js
window.providerUtils.registerProvider(
  providerInstance.name,
  providerInstance.supportedDomains,
  providerInstance
);
```
### Summary of Key Differences:
1. **Parsing complexity**: ChatGPT provider includes extensive SSE parsing for multiple content types, while AI Studio uses a simpler array-based JSON parsing.
2. **Thinking inclusion**: ChatGPT merges thoughts and content by default; AI Studio omits thoughts unless toggled.
3. **UI selectors**: AI Studio supports diverse textarea and button selectors for its SPA, plus function calling toggle logic.
4. **Retry strategy**: ChatGPT uses a 5-attempt loop; AI Studio retries over a longer window (up to 5 minutes).
This comparison highlights the design choices and implementations of each provider.
---
## 6. Conclusion
This consolidated documentation provides a holistic view of the provider architecture, detailing the utilities for managing providers and the specifics of the AI Studio, ChatGPT, and Claude providers. This unified approach aims to simplify understanding and maintenance of the provider system.

104
docs/content-script.md Normal file

@ -0,0 +1,104 @@
# Content Script Architecture (`content.js`)
This document describes the structure and flow of [`content.js`](extension/content.js:1), which acts as the in-page automation engine for injecting chat messages and monitoring responses in various AI web interfaces.
---
## 🧩 Overview
The content script:
1. Detects which AI platform is active (e.g., ChatGPT, AI Studio).
2. Loads the appropriate provider.
3. Sends chat messages programmatically.
4. Monitors the DOM for typed/streamed responses.
5. Reports back results to the background script.
---
## ⚙️ Initialization
### Boot Sequence
- Executed on load: [`initializeContentRelay()`](extension/content.js:60)
- Identifies provider using `window.providerUtils`
- Sends [`CHAT_RELAY_READY`](extension/content.js:105) to background
- Sets up provider capture and polling logic
```mermaid
sequenceDiagram
participant ContentScript
participant Provider
participant Background
ContentScript->>Provider: detectProvider()
ContentScript->>Background: CHAT_RELAY_READY
ContentScript->>DOM: monitor inputs/buttons
ContentScript->>DOM: start polling
```
---
## 📌 DOM Utilities
- Element detection: [`findPotentialSelectors()`](extension/content.js:18)
- Polling loop: [`startElementPolling()`](extension/content.js:139)
Searches for selectors like:
```js
provider.inputSelector
provider.sendButtonSelector
provider.responseSelector
```
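A polling loop over these selectors might look like the sketch below. The function names, the injected `query` function, and the 1-second interval are assumptions made for illustration; the real `startElementPolling()` may differ. In a page, `query` could simply be `(selector) => document.querySelector(selector)`.

```javascript
// Hypothetical helper: look up each provider selector and report what was found.
function findProviderElements(provider, query) {
  const selectors = {
    input: provider.inputSelector,
    sendButton: provider.sendButtonSelector,
    response: provider.responseSelector,
  };
  const found = {};
  const missing = [];
  for (const [key, selector] of Object.entries(selectors)) {
    const element = selector ? query(selector) : null;
    if (element) {
      found[key] = element;
    } else {
      missing.push(key);
    }
  }
  return { found, missing };
}

// Hypothetical polling loop: re-check on an interval until input and send button exist.
// The response area is not required here, on the assumption that it only appears after a send.
function startElementPollingSketch(provider, query, onReady, intervalMs = 1000) {
  const timer = setInterval(() => {
    const { found, missing } = findProviderElements(provider, query);
    if (found.input && found.sendButton) {
      clearInterval(timer);
      onReady(found, missing);
    }
  }, intervalMs);
  return () => clearInterval(timer); // lets the caller cancel polling
}
```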
---
## 💬 Message Injection
### Method: [`sendChatMessage(text)`](extension/content.js:172)
- Calls [`sendChatMessageWithRetry()`](extension/content.js:183)
- Retries sending if button/input not ready
- Uses `provider.sendChatMessage()`
---
## 📥 Response Capture
### A. DOM Monitoring
- [`setupAutomaticResponseCapture()`](extension/content.js:338): sets MutationObserver
- [`startMonitoringForResponse()`](extension/content.js:255): begins timed attempts
- Captures response with [`captureResponse()`](extension/content.js:505)
### B. Completion Detection
Platform-specific heuristics like:
- [`monitorGeminiResponse()`](extension/content.js:436)
- [`monitorResponseCompletion()`](extension/content.js:391)
---
## 🔁 Provider Coordination
- Uses shared `provider` object initialized at runtime
- Each provider has methods like:
  - `sendChatMessage()`
  - `getStreamingApiPatterns()`
  - `shouldSkipResponseMonitoring()`
---
## 🛡️ Fault Recovery
- Logs missing selectors or failures
- Retries up to 5x for message dispatch
- Reports failure back to background
---
## ✅ Summary
[`content.js`](extension/content.js:1) dynamically adapts to different AI platforms to send and monitor chat activity. It forms the frontend relay bridge between user intent (via server) and provider-specific DOM interfaces.


@ -0,0 +1,261 @@
# Design Document: AIStudioProviderV2 - Debugger-Based Message Sending
**Version:** 1.0
**Date:** 2025-05-08
## 1. Objective
To implement an alternative message sending mechanism for the AI Studio provider (`AIStudioProviderV2`) that utilizes the Chrome Debugger API to directly issue network requests to the `GenerateContent` endpoint. This approach aims to bypass DOM manipulation for sending messages, potentially increasing reliability and reducing flakiness associated with UI interactions.
## 2. Rationale
* **Increased Robustness:** Directly crafting and sending network requests can be more stable than simulating user interactions with DOM elements (input fields, buttons), which can be affected by page updates, dynamic attributes, or timing issues.
* **Reduced Complexity:** Eliminates the need for complex DOM selector logic, event dispatching (input, change, blur), and send button state checking/retries.
* **Leverage Existing Infrastructure:** The debugger API is already in use for capturing responses. Extending its use for sending creates a more unified interaction model with the target service.
* **Decoupling from UI Changes:** Less susceptible to breakage if AI Studio's UI structure (selectors, button states) changes.
## 3. Key Components & Changes
### 3.1. New Provider File: `extension/providers/aistudio_v2.js`
* **Creation:** This file will be a new provider, largely based on the existing [`extension/providers/aistudio.js`](extension/providers/aistudio.js:1).
* **Class Name:** `AIStudioProviderV2`
* **`captureMethod`:** Will remain `"debugger"` as response capture logic will be similar.
* **`sendChatMessage(text, requestId)` method:** This will be the primary method modified to implement debugger-based sending.
* **Other methods:** `initiateResponseCapture`, `handleDebuggerData`, `parseDebuggerResponse` will likely remain very similar or identical to `aistudio.js`, as response handling is unchanged.
### 3.2. `sendChatMessage` in `aistudio_v2.js` (Debugger-based Sending)
This method will no longer interact with the DOM to fill input fields or click buttons. Instead, it will:
1. **Store `lastSentMessage`:** `this.lastSentMessage = text;` (still useful for context, though not for DOM comparison).
2. **Define a "Dummy" Request URL:** Create a unique, identifiable URL pattern that `background.js` can intercept. This URL won't actually be hit on the network if intercepted correctly at the "Request" stage.
* Example: ``const dummyUrl = `https://aistudio.google.com/__aicr_debugger_send__/${requestId}?ts=${Date.now()}`;``
* The `requestId` and timestamp help in making it unique and potentially passing info.
3. **Trigger the Dummy Request:** Use `fetch` to initiate this dummy request. The `fetch` call itself is just a trigger.
    ```javascript
    try {
      // The body of this dummy fetch can carry the actual message payload
      // for easier access in background.js if needed, or background.js
      // can get it from its own context of the original SEND_CHAT_MESSAGE command.
      await fetch(dummyUrl, {
        method: 'POST', // Using POST to easily send a body
        body: JSON.stringify({
          action: "AICR_PROXY_SEND",
          originalMessage: text,
          originalRequestId: requestId,
          // Include any other relevant settings from the original command if needed by background.js
          // e.g., model, temperature, if these are to be part of the proxied request.
        }),
        headers: {
          'Content-Type': 'application/json'
        }
      });
      // This fetch doesn't need to "succeed" in the traditional sense.
      // Its purpose is to be intercepted.
      // The success/failure of the *actual* GenerateContent call will be handled
      // via messages from background.js.
      console.log(`[${this.name}] Dummy request triggered for requestId: ${requestId}`);
      return true; // Indicates the process to send via debugger was initiated.
                   // Actual success depends on background script's actions.
    } catch (error) {
      console.error(`[${this.name}] Error triggering dummy request for debugger send:`, error);
      return false;
    }
    ```
4. **Return Value:** The function should return `true` if the dummy request was successfully initiated, or `false` on an immediate error. The actual success of sending the message to AI Studio will be asynchronous and depend on `background.js`.
### 3.3. `background.js` Modifications
The `chrome.debugger.onEvent` listener for `Fetch.requestPaused` needs to be enhanced:
1. **Identify Dummy Request:**
\`\`\`javascript
if (message === "Fetch.requestPaused" && params.request.url.includes("__aicr_debugger_send__")) {
const debuggeeId = { tabId: tabId }; // Ensure debuggeeId is correctly defined
const interceptedRequestId = params.requestId; // This is the debugger's internal ID for the fetch
if (params.requestStage === 'Request') {
console.log(BG_LOG_PREFIX, \`Intercepted AICR_PROXY_SEND dummy request (ID: \${interceptedRequestId}) for tab \${tabId}\`);
try {
const postDataString = params.request.postData ? atob(params.request.postData) : null;
const dummyPayload = postDataString ? JSON.parse(postDataString) : {};
const originalMessage = dummyPayload.originalMessage;
const appRequestId = dummyPayload.originalRequestId; // The extension's internal requestId
if (!originalMessage || appRequestId === undefined) {
console.error(BG_LOG_PREFIX, "Dummy request missing originalMessage or appRequestId. Aborting proxy send.");
chrome.debugger.sendCommand(debuggeeId, "Fetch.failRequest", { requestId: interceptedRequestId, errorReason: "InvalidParams" });
// Optionally send an error message back to content script
return;
}
// TODO: Retrieve current model, temperature, etc. if needed.
// These might come from the original SEND_CHAT_MESSAGE command stored with appRequestId,
// or from extension settings. For now, we can hardcode or use defaults.
const modelName = "models/gemini-2.5-pro-preview-05-06"; // Example
// Construct the GenerateContent payload (this is the critical part)
const generateContentPayload = [
modelName,
[[[[null, originalMessage]], "user"]], // Simplified for a single user message
null, null,
[null, null, null, null, [null, null, null, null, null, 65536, 1, 0.95, 64, "text/plain", null, null, null, null, null, null, []], [1]]
];
const generateContentUrl = "https://alkalimakersuite-pa.clients6.google.com/$rpc/google.internal.alkali.applications.makersuite.v1.MakerSuiteService/GenerateContent";
const headers = [
{ name: "Content-Type", value: "application/json+protobuf" },
// Add any other headers observed in a legitimate request if necessary
// e.g., X-Goog-Api-Key, Authorization (if they are static or can be obtained)
// Cookies are usually sent automatically by the browser when transforming a request from the page's context.
];
console.log(BG_LOG_PREFIX, \`Transforming dummy request to POST \${generateContentUrl} for appRequestId: \${appRequestId}\`);
chrome.debugger.sendCommand(debuggeeId, "Fetch.continueRequest", {
requestId: interceptedRequestId, // The debugger's ID for the fetch being modified
url: generateContentUrl,
method: "POST",
postData: btoa(JSON.stringify(generateContentPayload)), // Must be base64 encoded
headers: headers
}, () => {
if (chrome.runtime.lastError) {
console.error(BG_LOG_PREFIX, \`Error calling Fetch.continueRequest for proxied send:\`, chrome.runtime.lastError.message);
// TODO: Notify content script of failure
} else {
console.log(BG_LOG_PREFIX, \`Successfully submitted transformed request for appRequestId: \${appRequestId}. Debugger will capture response.\`);
// The existing response capture logic for this URL should now take over.
// Ensure tabInfo.lastKnownRequestId is set correctly for this appRequestId.
const tabInfo = debuggerAttachedTabs.get(tabId);
if (tabInfo) {
tabInfo.lastKnownRequestId = appRequestId;
}
}
});
} catch (e) {
console.error(BG_LOG_PREFIX, "Error processing dummy request for proxy send:", e);
// errorReason must be a valid Network.ErrorReason value; "Failed" is the generic one.
chrome.debugger.sendCommand(debuggeeId, "Fetch.failRequest", { requestId: interceptedRequestId, errorReason: "Failed" });
}
return; // Handled
} else if (params.requestStage === 'Response') {
// This is the "response" to our dummy fetch. We can just complete it.
// The actual GenerateContent response will be a separate Fetch.requestPaused event.
console.log(BG_LOG_PREFIX, `Completing dummy request (ID: ${interceptedRequestId}) at Response stage.`);
chrome.debugger.sendCommand(debuggeeId, "Fetch.fulfillRequest", {
requestId: interceptedRequestId,
responseCode: 200,
responseHeaders: [{ name: "Content-Type", value: "application/json" }],
body: btoa(JSON.stringify({ success: true, message: "Dummy request processed by background." }))
});
return; // Handled
}
}
// ... existing Fetch.requestPaused logic for capturing actual GenerateContent responses ...
\`\`\`
2. **Payload Construction:** The `generateContentPayload` needs to be meticulously crafted based on observed network requests (like the screenshot provided). Initially, it can be simplified for a single message turn. History and other parameters can be added later.
3. **State Management:** Ensure `tabInfo.lastKnownRequestId` in `debuggerAttachedTabs` is correctly associated with the `appRequestId` of the message being sent via this proxy method, so the subsequent response capture links correctly.
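
The base64 step for `postData` can be sketched as below. This is a minimal illustration, not the extension's code: `toBase64Utf8`, `fromBase64Utf8`, and `encodePostData` are hypothetical helper names. It assumes the runtime provides `TextEncoder`, `TextDecoder`, `btoa`, and `atob` as globals, which holds for extension service workers and recent Node.js versions.

```javascript
// Hypothetical helpers; the names are illustrative, not part of the extension.

// Base64-encode a string as UTF-8 bytes. Plain btoa() throws on characters
// outside Latin-1, which chat messages routinely contain.
function toBase64Utf8(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Inverse of toBase64Utf8, handy for checking what was actually sent.
function fromBase64Utf8(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Serialize a payload array and encode it for Fetch.continueRequest's postData.
function encodePostData(payload) {
  return toBase64Utf8(JSON.stringify(payload));
}
```

Decoding the value with `fromBase64Utf8` while debugging is a quick way to confirm that the JSON the server receives matches the payload that was built.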
### 3.4. `content.js` Modifications
1. **Provider Selection:**
* On initialization, `content.js` will read a setting from `chrome.storage.sync` (e.g., `aistudioSendMethod: "dom" | "debugger"`).
* Based on this setting, it will instantiate either `AIStudioProvider` or `AIStudioProviderV2`.
```javascript
// In content.js
let activeProvider = null;
chrome.storage.sync.get({ aistudioSendMethod: "dom" }, (settings) => {
  if (settings.aistudioSendMethod === "debugger" && typeof AIStudioProviderV2 !== 'undefined') {
    activeProvider = new AIStudioProviderV2();
    console.log("[CS CONTENT] Using AIStudioProviderV2 (Debugger Send)");
  } else {
    activeProvider = new AIStudioProvider();
    console.log("[CS CONTENT] Using AIStudioProvider (DOM Send)");
  }
  // ... rest of initialization that uses activeProvider ...
  // Inform background about debugger targets if using debugger for response
  if (activeProvider.captureMethod === "debugger") {
    chrome.runtime.sendMessage({
      type: "SET_DEBUGGER_TARGETS",
      providerName: activeProvider.name, // Ensure V2 has a distinct name if needed for logs
      patterns: [{ urlPattern: activeProvider.debuggerUrlPattern }]
    });
  }
});
```
2. **Dynamic Switching (Optional):** Listen to `chrome.storage.onChanged` to re-initialize with the correct provider if the user changes the setting while the page is active. This might involve tearing down the old provider instance.
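
A minimal sketch of the dynamic-switching check, assuming the setting key `aistudioSendMethod` described above. `nextSendMethod` and `reinitializeProvider` are hypothetical names; the decision logic is kept as a pure function so it can be exercised outside the extension.

```javascript
// Returns the send method to switch to, or null if the storage change
// does not affect the setting. Unknown values fall back to "dom".
function nextSendMethod(changes, areaName) {
  if (areaName !== "sync") return null;
  const change = changes.aistudioSendMethod;
  if (!change || change.newValue === change.oldValue) return null;
  return change.newValue === "debugger" ? "debugger" : "dom";
}

// Wiring sketch (extension context only):
// chrome.storage.onChanged.addListener((changes, areaName) => {
//   const method = nextSendMethod(changes, areaName);
//   if (method) reinitializeProvider(method); // reinitializeProvider is hypothetical
// });
```

Keeping the listener thin like this leaves the teardown of the old provider instance to a single place.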
### 3.5. Popup UI (`popup.html`, `popup.js`)
1. **HTML:** Add a radio button group or a select dropdown in `popup.html`:
```html
<div>
  <label for="aistudioSendMethod">AI Studio Send Method:</label>
  <select id="aistudioSendMethod">
    <option value="dom">DOM Interaction</option>
    <option value="debugger">Debugger API</option>
  </select>
</div>
```
2. **JavaScript (`popup.js`):**
* Load the current setting on popup open and set the UI element's state.
* Save the selected value to `chrome.storage.sync` when it changes.
```javascript
// In popup.js
const sendMethodSelect = document.getElementById('aistudioSendMethod');
chrome.storage.sync.get({ aistudioSendMethod: "dom" }, (items) => {
  sendMethodSelect.value = items.aistudioSendMethod;
});
sendMethodSelect.addEventListener('change', (event) => {
  chrome.storage.sync.set({ aistudioSendMethod: event.target.value });
});
```
### 3.6. Manifest (`manifest.json`)
1. **Content Scripts:** Ensure `aistudio_v2.js` is listed as a content script for `aistudio.google.com` domains, similar to `aistudio.js`.
```json
"content_scripts": [
  {
    "matches": ["*://*.google.com/*"], // Broad, refine if possible
    "js": ["extension/common.js", "extension/content.js"],
    "css": ["css/content.css"]
  },
  {
    "matches": ["*://aistudio.google.com/*"],
    "js": ["extension/providers/aistudio.js", "extension/providers/aistudio_v2.js"],
    "all_frames": true
  }
  // ... other provider scripts ...
],
```
2. **Permissions:** The `debugger` permission should already be present.
## 4. Implementation Steps & Order
1. **Create `aistudio_v2.js`:** Copy `aistudio.js`, rename class to `AIStudioProviderV2`. Modify `sendChatMessage` to trigger the dummy fetch.
2. **Update `manifest.json`:** Add `aistudio_v2.js` to content scripts.
3. **Implement Popup UI:** Add HTML and JS for the send method selector.
4. **Modify `content.js`:** Implement provider selection logic based on `chrome.storage.sync`.
5. **Modify `background.js`:**
* Add logic to `Fetch.requestPaused` to detect and handle the `__aicr_debugger_send__` dummy request.
* Implement the transformation to a `GenerateContent` POST request.
* Ensure `lastKnownRequestId` is correctly managed for the proxied request.
6. **Testing & Refinement:**
* Test sending with the "Debugger API" option selected.
* Verify request construction in `background.js` logs.
* Verify response capture.
* Test switching between DOM and Debugger methods.
## 5. Potential Challenges & Considerations
* **`GenerateContent` Payload Complexity:** This is the most critical and fragile part. The payload structure must be exact. Any changes by Google to this private API could break it.
* **Authentication/Session Headers:** While `Fetch.continueRequest` usually handles cookies correctly, any special headers required by `GenerateContent` must be identified and included.
* **Error Handling:** Robust error handling is needed if `Fetch.continueRequest` fails, or if the transformed request is rejected by the server. Notifications back to the content script/provider are important.
* **Dynamic Parameters:** Model name, temperature, etc., are currently hardcoded in the plan. A more advanced implementation would make these configurable or derive them from the current AI Studio UI/settings if possible.
* **Security:** Ensure the `debuggerUrlPattern` for response capture and the dummy URL pattern are specific enough to avoid unintended interceptions.
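
One way to check pattern specificity before shipping is to test candidate URLs against the patterns offline. The sketch below assumes the wildcard semantics of Chrome DevTools Protocol URL patterns (`*` matches zero or more characters, `?` matches exactly one). `urlPatternToRegExp` is a hypothetical helper, and it treats a backslash as a literal character rather than as an escape, which is a simplification.

```javascript
// Converts a wildcard URL pattern into an anchored RegExp.
// '*' becomes "any run of characters", '?' becomes "exactly one character";
// every other regex metacharacter is matched literally.
function urlPatternToRegExp(pattern) {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const source = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${source}$`);
}

// True if any of the patterns matches the URL.
function matchesAnyPattern(url, patterns) {
  return patterns.some((pattern) => urlPatternToRegExp(pattern).test(url));
}
```

Running the response-capture pattern and the dummy-URL pattern against a list of URLs seen in the network panel shows quickly whether either one is broader than intended.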

85
docs/provider-aistudio.md Normal file

@ -0,0 +1,85 @@
# AI Studio Provider Architecture
This document outlines the implementation and operational logic of the [`AIStudioProvider`](extension/providers/aistudio.js:3) class used in the extension to interact with Google AI Studio's web interface.
---
## 🧩 Overview
[`AIStudioProvider`](extension/providers/aistudio.js:3) is a browser-automation provider class that sends messages to `aistudio.google.com` and captures its responses. It supports DOM-based or Chrome Debugger-based response capture and can optionally enable function calling.
---
## ⚙️ Configurable Options
```js
this.captureMethod = "debugger"; // or "dom"
this.debuggerUrlPattern = "*MakerSuiteService/GenerateContent*";
this.includeThinkingInMessage = false;
this.ENABLE_AISTUDIO_FUNCTION_CALLING = true;
```
These parameters control how responses are captured and whether additional intermediate output like “thinking” is included in results.
---
## 📌 DOM Selectors
- Input field: [`this.inputSelector`](extension/providers/aistudio.js:24)
- Send button: [`this.sendButtonSelector`](extension/providers/aistudio.js:27)
- Main response blocks: [`this.responseSelector`](extension/providers/aistudio.js:30)
- Typing indicators: [`this.thinkingIndicatorSelector`](extension/providers/aistudio.js:33)
Fallback selectors are used when DOM capture is selected and standard elements fail.
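
The fallback idea can be sketched as trying a list of selectors in order and returning the first hit. `queryFirst` is a hypothetical helper, not a method of the provider, and the selector strings in the usage comment are only examples.

```js
// Returns the first element matched by any selector, trying them in order,
// together with the selector that matched. `root` is anything exposing
// querySelector (for example `document` in a content script).
function queryFirst(root, selectors) {
  for (const selector of selectors) {
    const element = root.querySelector(selector);
    if (element) return { element, selector };
  }
  return null;
}

// Usage sketch in a content script:
// const hit = queryFirst(document, ['button.run-button', 'button[aria-label="Run"]']);
// if (hit) hit.element.click();
```

Returning the matching selector alongside the element makes it easy to log which fallback was needed after a UI change.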
---
## 🔄 Lifecycle
### 1. Initialization
- Assigns selectors and default behavior
- Enables function calling via [`ensureFunctionCallingEnabled()`](extension/providers/aistudio.js:72)
- Binds to `window.navigation` for SPA-aware page detection
### 2. Message Sending
- [`sendChatMessage(text)`](extension/providers/aistudio.js:131): Finds input field and button, inserts text, and triggers a click with retry and verification logic.
### 3. Response Capture
#### Method: Debugger
- Registers callback: [`initiateResponseCapture()`](extension/providers/aistudio.js:209)
- Handles debugger message: [`handleDebuggerData()`](extension/providers/aistudio.js:226)
- Parses chunks via [`parseDebuggerResponse()`](extension/providers/aistudio.js:439)
#### Method: DOM
- Starts mutation observer loop: [`_startDOMMonitoring()`](extension/providers/aistudio.js:598)
- Identifies end of generation: [`_isResponseStillGeneratingDOM()`](extension/providers/aistudio.js:577)
---
## 🛡️ Error & Edge Case Handling
- Detects failed button presses
- Gracefully handles unknown capture methods
- Times out function-calling polling after 7s with fallback logging
---
## 🔧 Function Calling Enable Logic
```mermaid
sequenceDiagram
participant Provider
participant DOM
Provider->>DOM: Query 'button[aria-label="Function calling"]'
DOM-->>Provider: aria-checked="false"
Provider->>DOM: click()
DOM-->>Provider: recheck aria-checked
```
---
## ✅ Summary
This provider integrates with AI Studio via browser automation. Its choice of capture methods and its fallback selectors for DOM monitoring help it tolerate layout and interface changes.

92
docs/provider-chatgpt.md Normal file

@ -0,0 +1,92 @@
# ChatGPT Provider Architecture
This document describes the implementation details of the [`ChatGptProvider`](extension/providers/chatgpt.js:3) class used to automate and capture interactions with `chatgpt.com`.
---
## 🧩 Overview
[`ChatGptProvider`](extension/providers/chatgpt.js:3) enables message injection, UI automation, and response capture from ChatGPT via DOM or Chrome Debugger methods. It uses retry logic for robust message delivery and streaming capture for response chunks.
---
## ⚙️ Configurable Parameters
```js
this.captureMethod = "debugger"; // or "dom"
this.debuggerUrlPattern = "*chatgpt.com/backend-api/conversation*";
this.includeThinkingInMessage = true;
```
These control the response source (network or UI) and message formatting behavior.
---
## 📌 DOM Elements
- Input: [`#prompt-textarea`](extension/providers/chatgpt.js:12)
- Send button: [`#composer-submit-button`](extension/providers/chatgpt.js:13)
- Response area: [`.message-bubble .text-content`](extension/providers/chatgpt.js:14)
- Loading spinner: [`.loading-spinner`](extension/providers/chatgpt.js:15)
- Fallback DOM: [`.message-container .response-text`](extension/providers/chatgpt.js:16)
---
## 🔄 Lifecycle
### 1. Initialization
- Sets up DOM selectors and state containers
- Initializes request tracking maps and debug logs
### 2. Sending Messages
- [`sendChatMessage()`](extension/providers/chatgpt.js:25):
- Finds input + button, injects message
- Retries on failure (up to 5 times)
- Waits between attempts and checks element readiness
### 3. Capturing Responses
#### A. Debugger Mode
- Callback registration via [`initiateResponseCapture()`](extension/providers/chatgpt.js:105)
- Processes data in [`handleDebuggerData()`](extension/providers/chatgpt.js:121)
- Uses accumulator map for text chunk assembly and [`parseDebuggerResponse()`](extension/providers/chatgpt.js:199) to interpret SSE stream format
#### B. DOM Mode
- Starts DOM observer loop via [`_startDOMMonitoring()`](extension/providers/chatgpt.js:499)
- Stops when [`_isResponseStillGeneratingDOM()`](extension/providers/chatgpt.js:489) returns false
---
## 🛡️ Error Handling
- [`_reportSendError()`](extension/providers/chatgpt.js:93) reports issues back to callback
- Handles:
- Missing input or button
- Disabled controls
- Empty raw debugger data
- Unparseable or non-relevant JSON payloads
---
## 🧠 Streaming SSE Parse Logic
```mermaid
sequenceDiagram
participant Debugger
participant Provider
participant ContentScript
Debugger-->>Provider: data: { chunk }
Provider->>Provider: parseDebuggerResponse()
Provider-->>ContentScript: Accumulated text + isFinal
```
---
## ✅ Summary
The ChatGPT provider interacts with ChatGPT through either its UI or its network traffic. It supports send retries, error reporting, and reconstruction of chunked responses, which helps it keep working across updates to the site's frontend.

61
docs/provider-claude.md Normal file

@ -0,0 +1,61 @@
# Claude Provider (`claude.js`)
The `ClaudeProvider` is designed to interface with Anthropic's Claude AI models via the `claude.ai` web interface. It primarily uses the Chrome DevTools Debugger API to intercept and process Server-Sent Events (SSE) for streaming responses.
## Key Features
- **Supported Domain:** `claude.ai`
- **Capture Method:** `debugger` (primary)
- Intercepts network responses matching the `debuggerUrlPattern`.
- Parses Server-Sent Events (SSE) to extract message content.
- **DOM Fallback:** Basic DOM capture logic exists but is secondary to the debugger method.
- **Stream Handling:**
- Accumulates text from `content_block_delta` SSE events.
- Identifies the end of a message by detecting `message_stop` events or `message_delta` events with a `stop_reason`.
- When `includeThinkingInMessage` is `false` (default), it sends the complete, accumulated message once the stream indicates completion.
## Configuration Properties
Located at the beginning of the `ClaudeProvider` class in `extension/providers/claude.js`:
- `this.captureMethod`: (String) Set to `"debugger"` for SSE interception. Can be set to `"dom"` for DOM-based capture (less reliable for streaming).
- `this.debuggerUrlPattern`: (String) URL pattern used by the debugger to identify Claude's response stream. Currently set to `"*\/completion*"`.
- `this.includeThinkingInMessage`: (Boolean) If `true`, the provider attempts to include intermediate "thinking" steps (not fully implemented/tested for Claude's SSE structure). Defaults to `false`, focusing on the final answer.
- `this.ENABLE_CLAUDE_FUNCTION_CALLING`: (Boolean) Intended for future use if Claude exposes a UI toggle for function calling. Currently, the related code is commented out as no such toggle is present.
## DOM Selectors
These selectors are used to interact with the Claude web interface:
- `this.inputSelector`: `'div.ProseMirror[contenteditable="true"]'` (The main chat input field)
- `this.sendButtonSelector`: `'button[aria-label="Send message"]'` (The button to send a message)
- `this.responseSelector`: A general selector for identifying response areas, primarily for DOM fallback (`.response-container, .response-text, .model-response, ...`).
- `this.thinkingIndicatorSelector`: Selectors for loading/thinking indicators, primarily for DOM fallback.
## Debugger Stream Parsing (`parseDebuggerResponse`)
This method is crucial for handling the SSE stream from Claude:
1. **Splits Chunks:** Each raw data chunk from the debugger can contain multiple SSE messages (e.g., `event: ...\ndata: ...\n\n`). The method splits these.
2. **Event Extraction:** For each SSE message, it extracts the `event:` type and `data:` payload.
3. **Text Accumulation:**
- If `event: content_block_delta` and `data.delta.type === "text_delta"`, the `data.delta.text` is appended to the current chunk's text.
4. **End-of-Message Detection:**
- If `event: message_stop` is encountered, the message is considered complete.
- If `event: message_delta` and the `data.delta.stop_reason` field is present, the message is considered complete.
5. **Output:** Returns an object `{ text: "accumulated_text_from_this_chunk", isFinalResponse: true/false }`.
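
The parsing steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in `claude.js`: `parseSseChunk` is a hypothetical name, the sketch assumes each chunk contains only complete SSE messages, and it silently skips payloads that are not valid JSON.

```js
// Parses one raw debugger chunk that may hold several SSE messages and
// returns the text found in it plus an end-of-message flag.
function parseSseChunk(rawChunk) {
  let text = "";
  let isFinalResponse = false;

  // SSE messages are separated by a blank line.
  for (const message of rawChunk.split(/\r?\n\r?\n/)) {
    let eventType = "";
    const dataLines = [];
    for (const line of message.split(/\r?\n/)) {
      if (line.startsWith("event:")) eventType = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    }

    if (eventType === "message_stop") {
      isFinalResponse = true;
      continue;
    }
    if (dataLines.length === 0) continue;

    let data;
    try {
      data = JSON.parse(dataLines.join("\n"));
    } catch {
      continue; // Partial or non-JSON payload: skipped in this sketch.
    }

    if (eventType === "content_block_delta" && data?.delta?.type === "text_delta") {
      text += data.delta.text;
    } else if (eventType === "message_delta" && data?.delta?.stop_reason) {
      isFinalResponse = true;
    }
  }
  return { text, isFinalResponse };
}
```

A real implementation would also need to carry an incomplete trailing message over to the next chunk, which this sketch leaves out.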
## Message Handling (`handleDebuggerData`)
1. **Buffering:** Uses `this.requestBuffers` to accumulate text for each `requestId` across multiple data chunks.
2. **Callback Invocation:**
- If `includeThinkingInMessage` is `false`:
- The main `responseCallback` (which sends data back to the extension's background script) is only called with the fully accumulated text when `parseDebuggerResponse` indicates `isFinalResponse: true` for a chunk, OR when the background script signals that this is the absolute final chunk from the debugger (`isFinalFromBackground: true`).
- If `includeThinkingInMessage` is `true` (not the current default):
- It would send intermediate text chunks.
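
A minimal sketch of the buffering behaviour described above. `createRequestBuffers` and its return shape are hypothetical. How the `includeThinkingInMessage: true` branch should behave on the final chunk is an assumption here, since the document marks that path as not the current default.

```js
// Creates a per-request accumulator. The returned function takes a parsed
// chunk ({ text, isFinalResponse }) and returns what should be forwarded to
// the response callback, or null when the caller should keep waiting.
function createRequestBuffers(includeThinkingInMessage = false) {
  const buffers = new Map(); // requestId -> accumulated text

  return function onParsedChunk(requestId, parsed, isFinalFromBackground = false) {
    const accumulated = (buffers.get(requestId) ?? "") + parsed.text;
    const isFinal = parsed.isFinalResponse || isFinalFromBackground;

    if (isFinal) {
      buffers.delete(requestId); // Free the buffer once the message is complete.
      return { text: accumulated, isFinal: true };
    }

    buffers.set(requestId, accumulated);
    // Assumption: in "thinking" mode, non-empty intermediate chunks are forwarded as-is.
    return includeThinkingInMessage && parsed.text
      ? { text: parsed.text, isFinal: false }
      : null;
  };
}
```

Deleting the buffer on completion keeps the map from growing across a long session.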
## Known Issues & Considerations
- **`debuggerUrlPattern` Specificity:** The accuracy of `this.debuggerUrlPattern` is critical. It must precisely match the URL endpoint from which Claude serves its chat responses. The current value is `"*\/completion*"`.
- **Function Calling:** The `ensureFunctionCallingEnabled` logic is currently commented out as there is no visible UI toggle for this feature on `claude.ai`.
- **Error Handling:** Basic error handling is in place, but complex network error scenarios or unexpected API changes from Claude might require more robust handling.

117
docs/provider-comparison.md Normal file

@ -0,0 +1,117 @@
# Provider File Comparison
Comparing:
* `extension/providers/chatgpt.js`
* `extension/providers/aistudio.js`
* `extension/providers/claude.js`
---
## 1. Configurable Properties
| Property | ChatGptProvider | AIStudioProvider | ClaudeProvider |
| -------------------------- | ---------------------------------------- | ------------------------------------------------------- | ------------------------------------------------------- |
| `captureMethod` | "debugger" | "debugger" | "debugger" |
| `debuggerUrlPattern` | `*chatgpt.com/backend-api/conversation*` | `*MakerSuiteService/GenerateContent*` | `*/completion*` (Matches Claude's streaming endpoint) |
| `includeThinkingInMessage` | `true` | `false` | `false` |
| Function calling toggle | N/A | `ENABLE_AISTUDIO_FUNCTION_CALLING` (with polling logic) | `ENABLE_CLAUDE_FUNCTION_CALLING` (logic commented out) |
---
## 2. Provider Identity
* **Name**:
* ChatGPT: `ChatGptProvider`
* AI Studio: `AIStudioProvider`
* Claude: `ClaudeProvider`
* **Supported Domains**:
* ChatGPT: `["chatgpt.com"]`
* AI Studio: `["aistudio.google.com"]`
* Claude: `["claude.ai"]`
---
## 3. Selectors & UI Interactions
| Selector Type | ChatGPT (`chatgpt.js`) | AI Studio (`aistudio.js`) | Claude (`claude.js`) |
| ------------------------ | ------------------------------------ | ------------------------------------------------------------ | -------------------------------------------------------- |
| Input Field | `#prompt-textarea` | `textarea.textarea`, `textarea.gmat-body-medium`, etc. | `div.ProseMirror[contenteditable="true"]` |
| Send Button | `button[data-testid="send-button"]` | `button.run-button`, `button[aria-label="Run"]`, etc. | `button[aria-label="Send message"]` |
| Response Capture (DOM) | `.message-bubble .text-content` | `.response-container`, `.model-response`, `.cmark-node`, ... | `.response-container`, `.response-text`, `.model-response`, ... |
| Thinking Indicator (DOM) | `.loading-spinner`, `.thinking-dots` | `.thinking-indicator`, `.loading-indicator`, etc. | `.thinking-indicator`, `.loading-indicator`, ... |
---
## 4. Message Sending Logic
* **ChatGptProvider**:
* Supports string, Blob, and array payloads.
* Sets `innerText` on a contenteditable div, uses `ClipboardEvent('paste')` for images.
* Retries clicking send up to 5 times with exponential backoff.
* **AIStudioProvider**:
* Handles similar payload types but pastes via `inputField.value` and paste event.
* Retries the send click up to 60 times (about 5 minutes in total), triggering UI events to enable the button.
* **ClaudeProvider**:
* Supports string, Blob, and array (text/image_url) payloads.
* Sets `textContent` on a contenteditable div, uses `ClipboardEvent('paste')` for images.
* Retries clicking send button if initially disabled.
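
The two retry styles can be compared by computing their delay schedules. The sketch below is illustrative only: the function names are hypothetical, the 200 ms base delay is an assumed value, and the 5-second interval is simply what 60 attempts over 5 minutes works out to, not a value taken from the providers.

```js
// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelays(maxAttempts, baseMs, maxMs = Infinity) {
  return Array.from({ length: maxAttempts }, (_, attempt) =>
    Math.min(baseMs * 2 ** attempt, maxMs)
  );
}

// Fixed-interval schedule: the same delay before every attempt.
function fixedDelays(maxAttempts, intervalMs) {
  return Array.from({ length: maxAttempts }, () => intervalMs);
}

// Total time a schedule can take if every attempt fails.
function totalDelayMs(delays) {
  return delays.reduce((sum, delay) => sum + delay, 0);
}
```

With these assumed values, five backoff attempts give up after a few seconds, while sixty fixed 5-second attempts span the full five minutes.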
---
## 5. Response Capture Mechanisms
* **Debugger-Based Streaming**:
* **ChatGPT**: Complex SSE parsing supporting thoughts, reasoning recaps, JSON patches, and OpenAI deltas.
* **AI Studio**: Simplified JSON array parsing with `extractTextSegments` and `findEndOfUnitMarker`, plus `includeThinkingInMessage` toggle.
* **Claude**: Parses SSE stream, looking for `content_block_delta` for text, and `message_stop` or `message_delta` with `stop_reason` for end-of-message.
* **DOM Fallback**:
* All three implement DOM monitoring, but selectors and timing differ:
* ChatGPT polls every 500ms, with stability checks and cleanup on final.
* AI Studio polls every 1s up to 15s, with fallback search through multiple DOM patterns.
* Claude has similar DOM fallback polling logic.
---
## 6. Registration
All three providers register themselves via:
```js
window.providerUtils.registerProvider(
providerInstance.name,
providerInstance.supportedDomains,
providerInstance
);
```
---
**Summary of Key Differences**:
1. **Parsing complexity**:
* ChatGPT: Extensive SSE parsing for multiple content types.
* AI Studio: Simpler array-based JSON parsing.
* Claude: SSE parsing focused on `content_block_delta`, `message_stop`, and `message_delta` with `stop_reason`.
2. **Thinking inclusion**:
* ChatGPT: Merges thoughts and content by default.
* AI Studio & Claude: Omit thoughts by default (`includeThinkingInMessage: false`).
3. **UI selectors & Features**:
* AI Studio: Diverse selectors for its SPA, plus (previously active) function calling toggle logic.
* Claude: Uses contenteditable div for input; function calling logic commented out.
4. **Retry strategy**:
* ChatGPT: 5-attempt loop for send.
* AI Studio: Longer retry window (up to 5 minutes) for send.
* Claude: Retries send click if button is initially disabled.
This should give a clear side-by-side comparison of their design choices and implementations.

59
docs/provider-utils.md Normal file

@ -0,0 +1,59 @@
# Provider Utils Architecture (`provider-utils.js`)
This document outlines the purpose and structure of [`provider-utils.js`](extension/providers/provider-utils.js:1), which manages provider registration and dynamic lookup based on domain.
---
## 🧩 Overview
This module is injected into the global `window` object as `window.providerUtils` and offers two core functions:
- [`registerProvider()`](extension/providers/provider-utils.js:7): Registers a provider instance with one or more domains.
- [`detectProvider()`](extension/providers/provider-utils.js:19): Looks up a provider instance based on the current hostname.
---
## 🌍 Provider Registry
Internally, the registry is held in:
```js
const providerMap = {}; // domain -> { name, instance }
```
Providers are registered like:
```js
registerProvider("AIStudioProvider", ["aistudio.google.com"], new AIStudioProvider());
```
This allows a single provider instance to be registered for several domains when necessary.
---
## 🔍 Provider Detection
The function [`detectProvider(hostname)`](extension/providers/provider-utils.js:19) performs a partial match against the registered `domainKey`s to determine the best match.
If no match is found, it returns `null` and logs the result.
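
A minimal sketch of registration and detection, assuming that "partial match" means a substring test of the hostname against each registered domain and that detection returns the stored `{ name, instance }` entry. The real `provider-utils.js` may differ on both points.

```js
const providerMap = {}; // domain -> { name, instance }

// Registers one provider instance under every domain in the list.
// Returns false (after logging) when the arguments are malformed.
function registerProvider(name, domains, instance) {
  if (typeof name !== "string" || !Array.isArray(domains) || !instance) {
    console.error("registerProvider: invalid arguments", { name, domains });
    return false;
  }
  for (const domain of domains) {
    providerMap[domain] = { name, instance };
  }
  return true;
}

// Returns the first entry whose domain key occurs in the hostname, or null.
function detectProvider(hostname) {
  if (typeof hostname !== "string" || hostname === "") {
    console.warn("detectProvider: missing or malformed hostname");
    return null;
  }
  for (const [domainKey, entry] of Object.entries(providerMap)) {
    if (hostname.includes(domainKey)) return entry;
  }
  return null;
}
```

A substring test is permissive: it would also match a hostname that merely contains the domain key, so a stricter suffix check may be preferable if that matters.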
---
## 🔐 Error Handling
- Validates types of all registration arguments.
- Logs malformed or missing hostnames during detection.
- Does not throw on misconfiguration; the problem is logged and the call is ignored, which aids fault tolerance.
---
## 🧪 Debug Logging
- Logs the entire provider map for visibility on each detection.
- Confirms successful matches and domain checks.
---
## ✅ Summary
[`provider-utils.js`](extension/providers/provider-utils.js:1) provides a lightweight and dynamic mechanism for associating hostnames with provider implementations. It ensures extensibility for future integrations while remaining simple and debuggable.

186
docs/server-architecture.md Normal file

@ -0,0 +1,186 @@
# Server Architecture Overview
This document outlines the architecture and operation of the WebSocket relay server found in [`api-relay-server/src/server.ts`](api-relay-server/src/server.ts). The system acts as an intermediary between browser extensions and an HTTP-based API, routing JSON messages bi-directionally, and managing request flow to prevent overloading the browser extension.
---
## 🌐 Server Layers
```mermaid
graph TD
    subgraph HTTP Layer
        A[Express App] --> B{"/v1/chat/completions"}
    end
    subgraph Request Handling Logic
        B -- Request --> C{"Extension Busy?"}
        C -- No --> D["processRequest()"]
        C -- Yes --> E{"Behavior: Drop or Queue?"}
        E -- Drop --> F[Respond 429]
        E -- Queue --> G[requestQueue]
        G -- Dequeue --> D
    end
    subgraph WebSocket Communication
        D -- SEND_CHAT_MESSAGE --> H["activeConnections[0]"]
        H -- "CHAT_RESPONSE_*" --> I[WebSocket Server]
        I -- "Resolve/Reject" --> D
    end
    D --> J[pendingRequests Map]
    D --> K["finishProcessingRequest()"]
    subgraph AdminUI["Admin UI & Config"]
        L[Express App] --> M["/admin & /v1/admin/*"]
        M <--> N[server-config.json]
    end
```
---
## 📁 Core File
- [`server.ts`](api-relay-server/src/server.ts): Main file where the entire Express server, WebSocket infrastructure, request queuing, and admin interface logic is defined.
---
## 🧩 Components
### 1. Express HTTP API
- **`/v1/chat/completions`**: Accepts OpenAI-compatible requests.
- Implements logic to check if a browser extension is busy (via `activeExtensionProcessingId`).
- Based on `newRequestBehavior` setting ('queue' or 'drop'):
- **Queue**: Adds incoming request to `requestQueue` if extension is busy. The HTTP response is deferred.
- **Drop**: Responds with 429 Too Many Requests if extension is busy.
- If extension is free, directly calls `processRequest()`.
- **`/v1/admin/server-info`**: Provides current server status and configuration, including `port`, `requestTimeoutMs`, and `newRequestBehavior`.
- **`/v1/admin/update-settings`**: Allows updating `port`, `requestTimeoutMs`, and `newRequestBehavior`. Changes are saved to `server-config.json`.
- **`/v1/admin/message-history`**: Retrieves recent message logs for the admin UI.
- **`/v1/admin/restart-server`**: Triggers a server restart.
- **`/admin`**: Serves the admin HTML interface.
- **`/health`**: Basic health check.
### 2. WebSocket Server
- [`WebSocketServer`](api-relay-server/src/server.ts:146): Accepts WebSocket connections from browser extensions.
- [`activeConnections`](api-relay-server/src/server.ts:43): Array storing active WebSocket client connections. Currently, only the first connection (`activeConnections[0]`) is used for sending messages.
- **Message Handling**: Receives messages from the extension (e.g., `CHAT_RESPONSE`, `CHAT_RESPONSE_CHUNK`, `CHAT_RESPONSE_ERROR`) and resolves or rejects promises in the `pendingRequests` map.
### 3. Queuing & Processing System
- **`activeExtensionProcessingId: number | null`**: Tracks the `requestId` of the message currently being processed by the extension. If `null`, the extension is considered free.
- **`newRequestBehavior: 'queue' | 'drop'`**: Global variable determining how to handle new requests when the extension is busy. Loaded from `server-config.json` (defaults to 'queue').
- **`requestQueue: QueuedRequest[]`**: An in-memory array holding `QueuedRequest` objects when `newRequestBehavior` is 'queue' and the extension is busy.
- **`QueuedRequest` Interface**: Defines the structure for storing an original HTTP request (`req`, `res`) and its parameters, to be processed later.
- **`async function processRequest(queuedItem: QueuedRequest)`**:
- Sets `activeExtensionProcessingId` to the current `queuedItem.requestId`.
- Logs `CHAT_REQUEST_PROCESSING`.
- Sends the `SEND_CHAT_MESSAGE` to the extension via WebSocket.
- Manages a `Promise` in `pendingRequests` for the response, including a timeout (`currentRequestTimeoutMs`).
- On response/error/timeout, formats and sends the HTTP response using the stored `queuedItem.res`.
- Calls `finishProcessingRequest()` in a `finally` block.
- **`function finishProcessingRequest(completedRequestId: number)`**:
- Clears `activeExtensionProcessingId`.
- Removes the request from `pendingRequests`.
- If `newRequestBehavior` is 'queue' and `requestQueue` is not empty, dequeues the next request and calls `processRequest()` for it.
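
The single-flight behaviour can be sketched as a small synchronous dispatcher. This is an illustration in plain JavaScript rather than the TypeScript in `server.ts`: `createDispatcher` and `startProcessing` are hypothetical names, and the HTTP and WebSocket parts are reduced to a callback.

```js
// Tracks one in-flight request at a time and either queues or drops the rest.
// startProcessing(requestId) stands in for sending SEND_CHAT_MESSAGE.
function createDispatcher(newRequestBehavior, startProcessing) {
  let activeExtensionProcessingId = null;
  const requestQueue = [];

  function processRequest(requestId) {
    activeExtensionProcessingId = requestId;
    startProcessing(requestId);
  }

  return {
    // Returns "processing", "queued", or "dropped" (the caller would answer 429).
    submit(requestId) {
      if (activeExtensionProcessingId === null) {
        processRequest(requestId);
        return "processing";
      }
      if (newRequestBehavior === "drop") return "dropped";
      requestQueue.push(requestId);
      return "queued";
    },
    // Clears the active slot and starts the next queued request, if any.
    finishProcessingRequest(completedRequestId) {
      if (activeExtensionProcessingId === completedRequestId) {
        activeExtensionProcessingId = null;
      }
      if (
        newRequestBehavior === "queue" &&
        activeExtensionProcessingId === null &&
        requestQueue.length > 0
      ) {
        processRequest(requestQueue.shift());
      }
    },
    get activeId() {
      return activeExtensionProcessingId;
    },
  };
}
```

Because the next request is started from inside `finishProcessingRequest`, the extension never sees two messages at once, which is the property the queue exists to protect.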
### 4. State Management
- [`pendingRequests`](api-relay-server/src/server.ts:44): A `Map` that stores `Promise` resolve/reject handlers, keyed by `requestId`. Used by `processRequest` to await responses from the WebSocket.
- [`requestCounter`](api-relay-server/src/server.ts:45): Generates unique `requestId`s.
- [`adminMessageHistory`](api-relay-server/src/server.ts:90): In-memory store for admin log entries.
---
## 🔄 Lifecycle Flow (with Queuing)
```mermaid
sequenceDiagram
participant User
participant ServerAPI
participant RequestLogic
participant ProcessRequestFunc
participant Extension
participant RequestQueue
User->>ServerAPI: POST /v1/chat/completions (req1)
ServerAPI->>RequestLogic: Handle req1
alt Extension is Free
RequestLogic->>ProcessRequestFunc: processRequest(req1)
ProcessRequestFunc->>Extension: SEND_CHAT_MESSAGE (req1)
Note over ProcessRequestFunc,Extension: activeExtensionProcessingId = req1.id
User->>ServerAPI: POST /v1/chat/completions (req2)
ServerAPI->>RequestLogic: Handle req2
RequestLogic->>RequestLogic: Extension Busy (req1.id)
alt newRequestBehavior == 'queue'
RequestLogic->>RequestQueue: Enqueue req2
Note over RequestLogic: HTTP Response for req2 deferred
else newRequestBehavior == 'drop'
RequestLogic-->>ServerAPI: Respond 429 for req2
ServerAPI-->>User: HTTP 429 (req2 dropped)
end
Extension-->>ProcessRequestFunc: CHAT_RESPONSE (req1)
ProcessRequestFunc-->>ServerAPI: Respond HTTP OK (req1)
ServerAPI-->>User: HTTP OK (req1)
ProcessRequestFunc->>RequestLogic: finishProcessingRequest(req1.id)
RequestLogic->>RequestLogic: activeExtensionProcessingId = null
alt Queue Not Empty and Behavior is 'queue'
RequestLogic->>RequestQueue: Dequeue req2
RequestLogic->>ProcessRequestFunc: processRequest(req2)
ProcessRequestFunc->>Extension: SEND_CHAT_MESSAGE (req2)
Note over ProcessRequestFunc,Extension: activeExtensionProcessingId = req2.id
Extension-->>ProcessRequestFunc: CHAT_RESPONSE (req2)
ProcessRequestFunc-->>ServerAPI: Respond HTTP OK (req2 via stored res)
ServerAPI-->>User: HTTP OK (req2)
ProcessRequestFunc->>RequestLogic: finishProcessingRequest(req2.id)
end
else Extension is Busy (initial state)
RequestLogic->>RequestLogic: Extension Busy
alt newRequestBehavior == 'queue'
RequestLogic->>RequestQueue: Enqueue req1
else newRequestBehavior == 'drop'
RequestLogic-->>ServerAPI: Respond 429 for req1
ServerAPI-->>User: HTTP 429 (req1 dropped)
end
end
```
---
## 🛡️ Error Handling
- If no browser extension is connected when a request arrives: Server responds with `503 Service Unavailable`.
- If no browser extension is connected when `processRequest` attempts to send a message (e.g., after being dequeued): The request is failed, and an error is sent to the original client if headers have not already been sent.
- If `newRequestBehavior` is 'drop' and the extension is busy: Server responds with `429 Too Many Requests`.
- Request Timeout: Each request processed by `processRequest` has a timeout (`currentRequestTimeoutMs`, configurable). If the extension doesn't respond in time, the promise is rejected, and an error is sent to the client.
- Errors from extension (`CHAT_RESPONSE_ERROR`): Logged, and the corresponding request promise is rejected, leading to an error response to the client.
---
## ⚙️ Configuration
The server's behavior can be configured via `server-config.json` located in the `dist` directory (created/managed by `server.ts`). The Admin UI also allows viewing and modifying these settings.
Key configurable options:
- **`port`**: The port on which the server listens. Requires server restart.
- **`requestTimeoutMs`**: Timeout in milliseconds for waiting for a response from the browser extension. Effective immediately.
- **`newRequestBehavior`**: Determines how new requests are handled if the extension is busy. Effective immediately. Can be:
  - `'queue'` (default): New requests are queued and processed sequentially.
  - `'drop'`: New requests are rejected with a 429 error.
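
A minimal sketch of validating these settings when they are loaded or updated. `normalizeConfig` is a hypothetical helper, and the default `port` and `requestTimeoutMs` values in the usage comment are placeholders rather than the server's real defaults; only the `'queue'` default comes from this document.

```js
// Returns a config object in which every invalid or missing field is
// replaced by the corresponding value from `defaults`.
function normalizeConfig(raw, defaults) {
  const config = { ...defaults };
  if (Number.isInteger(raw?.port) && raw.port > 0 && raw.port < 65536) {
    config.port = raw.port;
  }
  if (Number.isInteger(raw?.requestTimeoutMs) && raw.requestTimeoutMs > 0) {
    config.requestTimeoutMs = raw.requestTimeoutMs;
  }
  if (raw?.newRequestBehavior === "queue" || raw?.newRequestBehavior === "drop") {
    config.newRequestBehavior = raw.newRequestBehavior;
  }
  return config;
}

// Usage sketch (placeholder defaults, not the server's actual values):
// const config = normalizeConfig(JSON.parse(fileContents), {
//   port: 3000, requestTimeoutMs: 120000, newRequestBehavior: "queue",
// });
```

Falling back field by field means one bad value in `server-config.json` does not discard the rest of the file.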
---
## 🔌 Connection Monitoring
- The server maintains an array of `activeConnections`.
- The WebSocket protocol defines ping/pong frames, and the `ws` library answers incoming pings with pongs automatically, but it does not send keep-alive pings on its own. Explicit server-side ping logic is not currently implemented in `server.ts`.
- Disconnected clients are removed from `activeConnections`.
- `pendingRequests` are cleared on timeout or when a request completes (successfully or with an error) via `finishProcessingRequest`.
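The connection bookkeeping described above can be sketched as follows. This is a minimal illustration, not the code in `server.ts`: `trackConnection` is a hypothetical helper, and it assumes only that each socket exposes an `on('close', handler)` method, which is the shape `ws` sockets provide.

```javascript
// Array of currently connected extension sockets (name taken from the document).
const activeConnections = [];

// Hypothetical helper: registers a socket and removes it again when it closes.
function trackConnection(socket) {
  activeConnections.push(socket);
  socket.on('close', () => {
    const index = activeConnections.indexOf(socket);
    if (index !== -1) {
      activeConnections.splice(index, 1);
    }
  });
}
```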
---
## ✅ Summary
This architecture creates a decoupled, resilient relay system. The new queuing/dropping mechanism ensures that the browser extension processes only one message at a time, preventing race conditions and allowing for configurable behavior when the extension is busy. The Admin UI provides visibility and control over key operational parameters.

1076
docs/user-manual.md Normal file

File diff suppressed because it is too large.

1103
extension/background.js Normal file

File diff suppressed because it is too large.

886
extension/content.js Normal file

@ -0,0 +1,886 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// AI Chat Relay - Content Script
// Prefix for console logs
const CS_LOG_PREFIX = '[CS CONTENT]';
console.log(CS_LOG_PREFIX, "Content Script Injected & Loaded");
// Global state
let provider = null; // This will be set by initializeContentRelay
let setupComplete = false;
let currentRequestId = null;
let processingMessage = false; // Flag to track if we're currently processing a message
let responseMonitoringTimers = []; // Keep track of all monitoring timers
let captureAttempts = 0; // Track how many capture attempts we've made
const MAX_CAPTURE_ATTEMPTS = 30; // Maximum number of capture attempts
const CAPTURE_DELAY = 1000; // 1 second between capture attempts
// Helper function to find potential input fields and buttons
function findPotentialSelectors() {
console.log(CS_LOG_PREFIX, "Searching for potential input fields and buttons...");
// Find all textareas
const textareas = document.querySelectorAll('textarea');
console.log(CS_LOG_PREFIX, "Found textareas:", textareas.length);
textareas.forEach((textarea, index) => {
console.log(CS_LOG_PREFIX, `Textarea ${index}:`, {
id: textarea.id,
className: textarea.className,
ariaLabel: textarea.getAttribute('aria-label'),
placeholder: textarea.getAttribute('placeholder'),
name: textarea.name
});
});
// Find all input fields
const inputs = document.querySelectorAll('input[type="text"]');
console.log(CS_LOG_PREFIX, "Found text inputs:", inputs.length);
inputs.forEach((input, index) => {
console.log(CS_LOG_PREFIX, `Input ${index}:`, {
id: input.id,
className: input.className,
ariaLabel: input.getAttribute('aria-label'),
placeholder: input.getAttribute('placeholder'),
name: input.name
});
});
// Find all buttons
const buttons = document.querySelectorAll('button');
console.log(CS_LOG_PREFIX, "Found buttons:", buttons.length);
buttons.forEach((button, index) => {
console.log(CS_LOG_PREFIX, `Button ${index}:`, {
id: button.id,
className: button.className,
ariaLabel: button.getAttribute('aria-label'),
textContent: button.textContent.trim()
});
});
}
function initializeContentRelay() {
if (setupComplete) {
console.log(CS_LOG_PREFIX, "Initialization already attempted or complete.");
return;
}
console.log(CS_LOG_PREFIX, 'Initializing content relay...');
// Provider Detection
if (window.providerUtils) {
const detectedProvider = window.providerUtils.detectProvider(window.location.hostname); // New detection method
provider = detectedProvider; // Update the global provider instance
console.log(CS_LOG_PREFIX, 'Detected provider:', provider ? provider.name : 'None');
if (provider && typeof provider.getStreamingApiPatterns === 'function') {
const patternsFromProvider = provider.getStreamingApiPatterns();
console.log(CS_LOG_PREFIX, 'Retrieved patterns from provider:', patternsFromProvider);
if (patternsFromProvider && patternsFromProvider.length > 0) {
chrome.runtime.sendMessage({
type: "SET_DEBUGGER_TARGETS",
providerName: provider.name,
patterns: patternsFromProvider
}, response => {
if (chrome.runtime.lastError) {
console.error(CS_LOG_PREFIX, 'Error sending SET_DEBUGGER_TARGETS:', chrome.runtime.lastError.message);
} else {
console.log(CS_LOG_PREFIX, 'SET_DEBUGGER_TARGETS message sent, response:', response);
}
});
} else {
console.log(CS_LOG_PREFIX, 'No patterns returned by provider or patterns array is empty.');
}
} else {
if (provider) {
console.log(CS_LOG_PREFIX, `Provider '${provider.name}' found, but getStreamingApiPatterns method is missing or not a function.`);
} else {
console.log(CS_LOG_PREFIX, 'No current provider instance found to get patterns from.');
}
}
} else {
console.error(CS_LOG_PREFIX, 'providerUtils not found. Cannot detect provider or send patterns.');
}
// Send CHAT_RELAY_READY (always, after attempting provider setup)
chrome.runtime.sendMessage({
type: "CHAT_RELAY_READY",
chatInterface: provider ? provider.name : "unknown" // Add provider name
}, response => {
if (chrome.runtime.lastError) {
console.error(CS_LOG_PREFIX, 'Error sending CHAT_RELAY_READY:', chrome.runtime.lastError.message);
} else {
console.log(CS_LOG_PREFIX, 'CHAT_RELAY_READY message sent, response:', response);
}
});
// Setup message listeners (will be called later, once, via setupMessageListeners)
// If a provider is detected, proceed with provider-specific setup after a delay
if (provider) {
console.log(CS_LOG_PREFIX, `Proceeding with provider-specific setup for: ${provider.name}`);
setTimeout(() => {
// initializeContentRelay returns early once setupComplete is true, so this callback is
// scheduled at most once. Do not re-check setupComplete here: it is already true by the
// time the timer fires, and such a check would skip the DOM setup entirely.
findPotentialSelectors();
setupAutomaticResponseCapture();
startElementPolling();
console.log(CS_LOG_PREFIX, "Provider-specific DOM setup (response capture, polling) initiated after delay.");
}, 2000); // Delay to allow page elements to fully render
} else {
console.warn(CS_LOG_PREFIX, "No provider detected. Some provider-specific features (response capture, element polling) will not be initialized.");
}
setupComplete = true;
console.log(CS_LOG_PREFIX, "Content relay initialization sequence finished.");
}
// Poll for elements that might be loaded dynamically
function startElementPolling() {
if (!provider) {
console.warn(CS_LOG_PREFIX, "Cannot start element polling: no provider detected.");
return;
}
console.log(CS_LOG_PREFIX, "Starting element polling...");
// Check every 2 seconds for the input field and send button
const pollingInterval = setInterval(() => {
if (!provider) { // Provider might have been lost or was never there
clearInterval(pollingInterval);
console.warn(CS_LOG_PREFIX, "Stopping element polling: provider became unavailable.");
return;
}
const inputField = document.querySelector(provider.inputSelector);
const sendButton = document.querySelector(provider.sendButtonSelector);
if (inputField) {
console.log(CS_LOG_PREFIX, "Found input field:", inputField);
}
if (sendButton) {
console.log(CS_LOG_PREFIX, "Found send button:", sendButton);
}
if (inputField && sendButton) {
console.log(CS_LOG_PREFIX, "Found all required elements, stopping polling");
clearInterval(pollingInterval);
}
}, 2000);
}
// Function to send a message to the chat interface
function sendChatMessage(text) {
if (!provider) {
console.error(CS_LOG_PREFIX, "Cannot send chat message: No provider configured.");
processingMessage = false; // Reset flag
return false;
}
// Try to send the message with retries
return sendChatMessageWithRetry(text, 5); // Try up to 5 times
}
// Helper function to send a message with retries
function sendChatMessageWithRetry(text, maxRetries, currentRetry = 0) {
if (!provider) {
console.error(CS_LOG_PREFIX, `Cannot send chat message with retry (attempt ${currentRetry + 1}/${maxRetries}): No provider.`);
processingMessage = false;
return false;
}
try {
const inputField = document.querySelector(provider.inputSelector);
if (!inputField) {
console.log(CS_LOG_PREFIX, `Could not find input field (attempt ${currentRetry + 1}/${maxRetries})`);
if (currentRetry < maxRetries - 1) {
console.log(CS_LOG_PREFIX, `Retrying in 1 second...`);
setTimeout(() => {
sendChatMessageWithRetry(text, maxRetries, currentRetry + 1);
}, 1000);
return true;
}
console.error(CS_LOG_PREFIX, "Could not find input field after all retries");
processingMessage = false;
return false;
}
const sendButton = document.querySelector(provider.sendButtonSelector);
if (!sendButton) {
console.log(CS_LOG_PREFIX, `Could not find send button (attempt ${currentRetry + 1}/${maxRetries})`);
if (currentRetry < maxRetries - 1) {
console.log(CS_LOG_PREFIX, `Retrying in 1 second...`);
setTimeout(() => {
sendChatMessageWithRetry(text, maxRetries, currentRetry + 1);
}, 1000);
return true;
}
console.error(CS_LOG_PREFIX, "Could not find send button after all retries");
processingMessage = false;
return false;
}
const result = provider.sendChatMessage(text, inputField, sendButton);
if (result) {
console.log(CS_LOG_PREFIX, "Message sent successfully via provider.");
if (provider.shouldSkipResponseMonitoring && provider.shouldSkipResponseMonitoring()) {
console.log(CS_LOG_PREFIX, `Provider ${provider.name} has requested to skip response monitoring.`);
processingMessage = false; // Message sent, no monitoring, so reset.
} else {
console.log(CS_LOG_PREFIX, `Waiting ${CAPTURE_DELAY/1000} seconds before starting to monitor for responses...`);
const timer = setTimeout(() => {
console.log(CS_LOG_PREFIX, "Starting to monitor for responses now");
startMonitoringForResponse();
}, CAPTURE_DELAY);
responseMonitoringTimers.push(timer);
}
} else {
console.error(CS_LOG_PREFIX, "Provider reported failure sending message.");
processingMessage = false; // Reset on failure
}
return result;
} catch (error) {
console.error(CS_LOG_PREFIX, "Error sending message:", error);
if (currentRetry < maxRetries - 1) {
console.log(CS_LOG_PREFIX, `Error occurred, retrying in 1 second... (attempt ${currentRetry + 1}/${maxRetries})`);
setTimeout(() => {
sendChatMessageWithRetry(text, maxRetries, currentRetry + 1);
}, 1000);
return true;
}
processingMessage = false;
return false;
}
}
// Function to start monitoring for a response
function startMonitoringForResponse() {
if (!provider || !provider.responseSelector || !provider.getResponseText) {
console.error(CS_LOG_PREFIX, "Cannot monitor for response: Provider or necessary provider methods/selectors are not configured.");
processingMessage = false; // Can't monitor, so reset.
return;
}
console.log(CS_LOG_PREFIX, "Starting response monitoring process...");
captureAttempts = 0; // Reset capture attempts for this new monitoring session
const attemptCapture = () => {
if (!processingMessage && currentRequestId === null) {
console.log(CS_LOG_PREFIX, "Response monitoring stopped because processingMessage is false and currentRequestId is null (likely request completed or cancelled).");
return; // Stop if no longer processing a message
}
if (captureAttempts >= MAX_CAPTURE_ATTEMPTS) {
console.error(CS_LOG_PREFIX, "Maximum response capture attempts reached. Stopping monitoring.");
// Send a timeout/error message back to the background script
if (currentRequestId !== null) { // Ensure there's a request ID to report error for
chrome.runtime.sendMessage({
type: "FINAL_RESPONSE_TO_RELAY",
requestId: currentRequestId,
error: "Response capture timed out in content script.",
isFinal: true // Treat as final to unblock server
}, response => {
if (chrome.runtime.lastError) {
console.error(CS_LOG_PREFIX, 'Error sending capture timeout error:', chrome.runtime.lastError.message);
} else {
console.log(CS_LOG_PREFIX, 'Capture timeout error sent to background, response:', response);
}
});
}
processingMessage = false;
currentRequestId = null; // Clear current request ID as it timed out
return;
}
captureAttempts++;
console.log(CS_LOG_PREFIX, `Response capture attempt ${captureAttempts}/${MAX_CAPTURE_ATTEMPTS}`);
const responseElement = document.querySelector(provider.responseSelector);
if (responseElement) {
const responseText = provider.getResponseText(responseElement);
const isFinal = provider.isResponseComplete ? provider.isResponseComplete(responseElement) : false; // Default to false if not implemented
console.log(CS_LOG_PREFIX, `Captured response text (length: ${responseText.length}), isFinal: ${isFinal}`);
// Send to background
chrome.runtime.sendMessage({
type: "FINAL_RESPONSE_TO_RELAY", // Or a new type like "PARTIAL_RESPONSE" if needed
requestId: currentRequestId,
text: responseText,
isFinal: isFinal
}, response => {
if (chrome.runtime.lastError) {
console.error(CS_LOG_PREFIX, 'Error sending response data to background:', chrome.runtime.lastError.message);
} else {
console.log(CS_LOG_PREFIX, 'Response data sent to background, response:', response);
}
});
if (isFinal) {
console.log(CS_LOG_PREFIX, "Final response detected. Stopping monitoring.");
processingMessage = false; // Reset flag as processing is complete
// currentRequestId will be cleared by handleProviderResponse or if a new message comes
return;
}
} else {
console.log(CS_LOG_PREFIX, "Response element not found yet.");
}
// Continue polling
const timer = setTimeout(attemptCapture, CAPTURE_DELAY);
responseMonitoringTimers.push(timer);
};
// Initial call to start the process
attemptCapture();
}
// Function to set up automatic response capture using MutationObserver
function setupAutomaticResponseCapture() {
if (!provider || !provider.responseContainerSelector || typeof provider.handleMutation !== 'function') {
console.warn(CS_LOG_PREFIX, "Cannot set up automatic response capture: Provider or necessary provider methods/selectors are not configured.");
return;
}
console.log(CS_LOG_PREFIX, "Setting up MutationObserver for automatic response capture on selector:", provider.responseContainerSelector);
const targetNode = document.querySelector(provider.responseContainerSelector);
if (!targetNode) {
console.warn(CS_LOG_PREFIX, `Response container element ('${provider.responseContainerSelector}') not found. MutationObserver not started. Will rely on polling or debugger.`);
// Optionally, retry finding the targetNode after a delay, or fall back to polling exclusively.
// For now, we just warn and don't start the observer.
return;
}
const config = { childList: true, subtree: true, characterData: true };
const callback = (mutationsList, observer) => {
// If not processing a message, or no current request, don't do anything.
// This check is crucial to prevent processing mutations when not expected.
if (!processingMessage || currentRequestId === null) {
// console.log(CS_LOG_PREFIX, "MutationObserver: Ignoring mutation, not actively processing a message or no currentRequestId.");
return;
}
// Let the provider handle the mutation and decide if it's relevant
// The provider's handleMutation should call handleProviderResponse with the requestId
try {
provider.handleMutation(mutationsList, observer, currentRequestId, handleProviderResponse);
} catch (e) {
console.error(CS_LOG_PREFIX, "Error in provider.handleMutation:", e);
}
};
const observer = new MutationObserver(callback);
try {
observer.observe(targetNode, config);
console.log(CS_LOG_PREFIX, "MutationObserver started on:", targetNode);
} catch (e) {
console.error(CS_LOG_PREFIX, "Failed to start MutationObserver:", e, "on target:", targetNode);
// Fallback or error handling if observer cannot be started
}
// Store the observer if we need to disconnect it later
// e.g., window.chatRelayObserver = observer;
}
// Function to monitor for the completion of a response (e.g., when a "thinking" indicator disappears)
// This is a more generic version, specific providers might have more tailored logic.
function monitorResponseCompletion(element) {
if (!provider || !provider.thinkingIndicatorSelector) {
console.warn(CS_LOG_PREFIX, "Cannot monitor response completion: No thinkingIndicatorSelector in provider.");
return;
}
const thinkingIndicator = document.querySelector(provider.thinkingIndicatorSelector);
if (!thinkingIndicator) {
// If the indicator is already gone, assume completion or it never appeared.
// Provider's getResponseText should ideally capture the full text.
console.log(CS_LOG_PREFIX, "Thinking indicator not found, assuming response is complete or was never present.");
// Potentially call captureResponse one last time if needed by provider logic
// captureResponse(null, true); // Example, might need adjustment
return;
}
console.log(CS_LOG_PREFIX, "Thinking indicator found. Monitoring for its removal...");
const observer = new MutationObserver((mutationsList, obs) => {
// Check if the thinking indicator (or its parent, if it's removed directly) is no longer in the DOM
// or if a specific class/attribute indicating completion appears.
// This logic needs to be robust and provider-specific.
// A simple check: if the element itself is removed or a known parent.
// More complex checks might involve looking for specific classes on the response element.
if (!document.body.contains(thinkingIndicator)) {
console.log(CS_LOG_PREFIX, "Thinking indicator removed. Assuming response completion.");
obs.disconnect();
// Capture the final response
// This assumes captureResponse can get the full text now.
// The 'true' flag indicates this is considered the final capture.
captureResponse(null, true);
}
// Add other provider-specific checks here if needed
});
// Observe the parent of the thinking indicator for changes in its children (e.g., removal of the indicator)
// Or observe attributes of the indicator itself if it changes state instead of being removed.
if (thinkingIndicator.parentNode) {
observer.observe(thinkingIndicator.parentNode, { childList: true, subtree: true });
} else {
console.warn(CS_LOG_PREFIX, "Thinking indicator has no parent node to observe. Cannot monitor for removal effectively.");
}
}
// Specific monitoring for Gemini, if needed (example)
function monitorGeminiResponse(element) {
// Gemini specific logic for monitoring response element for completion
// This might involve looking for specific attributes or child elements
// that indicate the stream has finished.
console.log(CS_LOG_PREFIX, "Monitoring Gemini response element:", element);
// Example: Observe for a specific class or attribute change
const observer = new MutationObserver((mutationsList, obs) => {
let isComplete = false;
// Check mutations for signs of completion based on Gemini's DOM structure
// For instance, a "generating" class is removed, or a "complete" attribute is set.
// This is highly dependent on the actual Gemini interface.
// Example (conceptual):
// if (element.classList.contains('response-complete')) {
// isComplete = true;
// }
if (isComplete) {
console.log(CS_LOG_PREFIX, "Gemini response detected as complete by mutation.");
obs.disconnect();
captureResponse(element, true); // Capture final response
}
});
observer.observe(element, { attributes: true, childList: true, subtree: true });
console.log(CS_LOG_PREFIX, "Gemini response observer started.");
}
function monitorGeminiContentStability(element) {
let lastContent = "";
let stableCount = 0;
const STABLE_THRESHOLD = 3; // Number of intervals content must remain unchanged
const CHECK_INTERVAL = 300; // Milliseconds
console.log(CS_LOG_PREFIX, "Starting Gemini content stability monitoring for element:", element);
const intervalId = setInterval(() => {
if (!processingMessage || currentRequestId === null) {
console.log(CS_LOG_PREFIX, "Gemini stability: Stopping, no longer processing message.");
clearInterval(intervalId);
return;
}
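const currentContent = provider.getResponseText(element);
if (currentContent === "") {
// Nothing rendered yet: an empty element is not a stable response, so keep waiting.
stableCount = 0;
} else if (currentContent === lastContent) {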
stableCount++;
console.log(CS_LOG_PREFIX, `Gemini stability: Content stable, count: ${stableCount}`);
} else {
lastContent = currentContent;
stableCount = 0; // Reset if content changes
console.log(CS_LOG_PREFIX, `Gemini stability: Content changed. New length: ${currentContent.length}`);
// Send partial update if provider wants it
if (provider.sendPartialUpdates) {
handleProviderResponse(currentRequestId, currentContent, false);
}
}
if (stableCount >= STABLE_THRESHOLD) {
console.log(CS_LOG_PREFIX, "Gemini stability: Content stable for threshold. Assuming final.");
clearInterval(intervalId);
// Ensure the very latest content is captured and sent as final
const finalContent = provider.getResponseText(element);
handleProviderResponse(currentRequestId, finalContent, true);
}
}, CHECK_INTERVAL);
responseMonitoringTimers.push(intervalId); // Store to clear if needed
}
// Function to capture the response text
// potentialTurnElement is passed by some providers (like Gemini) if they identify the specific response "turn" element
function captureResponse(potentialTurnElement = null, isFinal = false) {
if (!provider || !provider.getResponseText) {
console.error(CS_LOG_PREFIX, "Cannot capture response: No provider or getResponseText method.");
if (currentRequestId !== null) {
handleProviderResponse(currentRequestId, "Error: Provider misconfiguration for response capture.", true);
}
return;
}
// Use the potentialTurnElement if provided and valid, otherwise fall back to provider.responseSelector
let responseElement = null;
if (potentialTurnElement && typeof potentialTurnElement === 'object' && potentialTurnElement.nodeType === 1) {
responseElement = potentialTurnElement;
console.log(CS_LOG_PREFIX, "Using provided potentialTurnElement for capture:", responseElement);
} else {
if (!provider.responseSelector) {
console.error(CS_LOG_PREFIX, "Cannot capture response: No responseSelector in provider and no valid potentialTurnElement given.");
if (currentRequestId !== null) {
handleProviderResponse(currentRequestId, "Error: Provider responseSelector missing.", true);
}
return;
}
responseElement = document.querySelector(provider.responseSelector);
console.log(CS_LOG_PREFIX, "Using provider.responseSelector for capture:", provider.responseSelector);
}
if (!responseElement) {
console.warn(CS_LOG_PREFIX, "Response element not found during capture.");
// If it's supposed to be final and element is not found, it might be an issue.
if (isFinal && currentRequestId !== null) {
handleProviderResponse(currentRequestId, "Error: Response element not found for final capture.", true);
}
return;
}
const responseText = provider.getResponseText(responseElement);
// isFinal flag is now passed as an argument, but provider might have its own check
const trulyFinal = isFinal || (provider.isResponseComplete ? provider.isResponseComplete(responseElement) : false);
console.log(CS_LOG_PREFIX, `Captured response (length: ${responseText.length}), isFinal: ${trulyFinal}. Passed isFinal: ${isFinal}`);
if (currentRequestId === null) {
console.warn(CS_LOG_PREFIX, "captureResponse: currentRequestId is null. Cannot send response to background.");
return;
}
// Call handleProviderResponse, which will then relay to background
// This centralizes the logic for sending FINAL_RESPONSE_TO_RELAY
handleProviderResponse(currentRequestId, responseText, trulyFinal);
}
// Function to clear all active response monitoring timers
function clearResponseMonitoringTimers() {
console.log(CS_LOG_PREFIX, `Clearing ${responseMonitoringTimers.length} response monitoring timers.`);
responseMonitoringTimers.forEach(timerId => clearTimeout(timerId)); // Works for both setTimeout and setInterval IDs
responseMonitoringTimers = []; // Reset the array
}
// Define the message listener setup function before it is called.
// Renamed from setupAutomaticMessageSending.
function setupMessageListeners() {
// Listen for commands from the background script
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
if (message.action === "SEND_CHAT_MESSAGE") {
const messageContent = message.messageContent; // Use messageContent
let messagePreview = "";
if (typeof messageContent === 'string') {
messagePreview = `String: "${messageContent.substring(0, 50)}..."`;
} else if (messageContent instanceof ArrayBuffer) {
messagePreview = `ArrayBuffer data (size: ${messageContent.byteLength} bytes)`;
} else if (messageContent instanceof Blob) {
messagePreview = `Blob data (size: ${messageContent.size} bytes, type: ${messageContent.type})`;
} else if (messageContent && typeof messageContent === 'object') {
messagePreview = `Object data (type: ${Object.prototype.toString.call(messageContent)})`;
} else {
messagePreview = `Data type: ${typeof messageContent}, Value: ${String(messageContent).substring(0,50)}`;
}
console.log(CS_LOG_PREFIX, "Received command to send message:", messagePreview, "Request ID:", message.requestId, "Last Processed Text:", message.lastProcessedText ? `"${message.lastProcessedText.substring(0,50)}..."` : "null");
if (!provider) {
console.error(CS_LOG_PREFIX, "Cannot send message: No provider detected.");
sendResponse({ success: false, error: "No provider detected" });
return true;
}
// Superseding / duplicate requestId logic (unchanged)
if (processingMessage && currentRequestId !== null && currentRequestId !== message.requestId) {
console.warn(CS_LOG_PREFIX, `New message (requestId: ${message.requestId}) received while request ${currentRequestId} was processing. The new message will supersede the old one.`);
clearResponseMonitoringTimers();
processingMessage = false;
currentRequestId = null;
} else if (processingMessage && currentRequestId === message.requestId) {
console.warn(CS_LOG_PREFIX, `Received duplicate SEND_CHAT_MESSAGE for already processing requestId: ${message.requestId}. Ignoring duplicate command.`);
sendResponse({ success: false, error: "Duplicate command for already processing requestId."});
return true;
}
// Attempt to get the input field
const inputField = document.querySelector(provider.inputSelector);
let currentUIInputText = null;
if (inputField) {
currentUIInputText = inputField.value;
} else {
console.error(CS_LOG_PREFIX, "Input field not found via selector:", provider.inputSelector, "Cannot process SEND_CHAT_MESSAGE for requestId:", message.requestId);
// Reset state if this was meant to be the current request
if (currentRequestId === message.requestId) { // Check if we were about to set this as current
processingMessage = false; // Ensure it's reset if it was about to become active
// currentRequestId is not yet set to message.requestId here if it's a new command
}
sendResponse({ success: false, error: "Input field not found by content script." });
return true;
}
// Duplicate Message Scenario Check:
// 1. We have a record of the last processed text from the background script.
// 2. The server is trying to send that exact same text again (messageContent === message.lastProcessedText).
// 3. The UI input field also currently contains that exact same text (currentUIInputText === messageContent).
let isDuplicateMessageScenario = false;
if (typeof messageContent === 'string' && typeof message.lastProcessedText === 'string' && typeof currentUIInputText === 'string') {
// Compare explicitly so the flag is always a boolean; an empty lastProcessedText never counts as a duplicate.
isDuplicateMessageScenario = message.lastProcessedText !== '' &&
messageContent === message.lastProcessedText &&
currentUIInputText === messageContent;
}
if (isDuplicateMessageScenario) {
console.log(CS_LOG_PREFIX, `Duplicate message scenario detected for requestId: ${message.requestId}.`);
console.log(CS_LOG_PREFIX, ` Server wants to send: "${messageContent.substring(0, 50)}..."`);
console.log(CS_LOG_PREFIX, ` Last processed text was: "${message.lastProcessedText.substring(0, 50)}..."`);
console.log(CS_LOG_PREFIX, ` Current UI input is: "${currentUIInputText.substring(0, 50)}..."`);
console.log(CS_LOG_PREFIX, "Clearing input field and notifying background.");
inputField.value = ''; // Clear the input field
// Optionally, dispatch 'input' or 'change' events if the website needs them for reactivity
// inputField.dispatchEvent(new Event('input', { bubbles: true, cancelable: true }));
chrome.runtime.sendMessage({
type: "DUPLICATE_MESSAGE_HANDLED",
requestId: message.requestId,
originalText: messageContent // The text that was duplicated
}, response => {
if (chrome.runtime.lastError) {
console.error(CS_LOG_PREFIX, 'Error sending DUPLICATE_MESSAGE_HANDLED:', chrome.runtime.lastError.message);
} else {
console.log(CS_LOG_PREFIX, 'DUPLICATE_MESSAGE_HANDLED sent to background, response:', response);
}
});
// This request is now considered "handled" by the content script (as a duplicate).
// Reset content script's immediate processing state if this was about to become the active request.
// Note: currentRequestId might not yet be message.requestId if this is a brand new command.
// The background script will manage its own processingRequest flag based on DUPLICATE_MESSAGE_HANDLED.
// For content.js, we ensure we don't proceed to send this.
// If currentRequestId was already message.requestId (e.g. from a retry/glitch), reset it.
if (currentRequestId === message.requestId) {
processingMessage = false;
currentRequestId = null;
}
sendResponse({ success: true, message: "Duplicate message scenario handled by clearing input." });
return true;
}
// If not a duplicate, proceed with normal sending logic:
console.log(CS_LOG_PREFIX, `Not a duplicate scenario for requestId: ${message.requestId}. Proceeding to send.`);
processingMessage = true;
currentRequestId = message.requestId;
console.log(CS_LOG_PREFIX, `Set currentRequestId to ${currentRequestId} for processing.`);
if (provider && typeof provider.sendChatMessage === 'function') {
// Capture the id locally: the global currentRequestId may be replaced by a newer request
// before the promise settles, and the callbacks below must not clobber that newer state.
const requestId = message.requestId;
const resetIfStillCurrent = () => {
if (currentRequestId === requestId) {
processingMessage = false;
currentRequestId = null;
}
};
// Wrapping the call in a Promise executor turns a synchronous throw into a rejection and
// also accepts a provider that returns a plain boolean instead of a promise.
new Promise(resolve => resolve(provider.sendChatMessage(messageContent, requestId))) // Pass messageContent and the requestId
.then(success => {
if (success) {
console.log(CS_LOG_PREFIX, `Message sending initiated successfully via provider for requestId: ${requestId}.`);
if (typeof provider.initiateResponseCapture === 'function') {
console.log(CS_LOG_PREFIX, `Calling provider.initiateResponseCapture for requestId: ${requestId}`);
provider.initiateResponseCapture(requestId, handleProviderResponse);
} else {
console.error(CS_LOG_PREFIX, `Provider ${provider.name} does not have initiateResponseCapture method. Response will not be processed for requestId ${requestId}.`);
// Without response capture the request would hang on the server side,
// so report a final error to the background script.
chrome.runtime.sendMessage({
type: "FINAL_RESPONSE_TO_RELAY",
requestId: requestId,
error: `Provider ${provider.name} cannot capture responses. Message sent but no response will be relayed.`,
isFinal: true
});
resetIfStillCurrent(); // As we can't process the response
}
sendResponse({ success: true, message: "Message sending initiated by provider." });
} else {
console.error(CS_LOG_PREFIX, `Provider failed to initiate sending message for requestId: ${requestId}.`);
resetIfStillCurrent();
sendResponse({ success: false, error: "Provider failed to send message." });
}
}).catch(error => {
console.error(CS_LOG_PREFIX, `Error during provider.sendChatMessage for requestId: ${requestId}:`, error);
resetIfStillCurrent();
sendResponse({ success: false, error: `Error sending message: ${error.message}` });
});
} else {
console.error(CS_LOG_PREFIX, "Provider or provider.sendChatMessage is not available for requestId:", message.requestId);
processingMessage = false;
currentRequestId = null; // Ensure reset if it was about to be set
sendResponse({ success: false, error: "Provider or sendChatMessage method missing." });
}
return true; // Indicate async response
} else if (message.type === "DEBUGGER_RESPONSE") {
console.log(CS_LOG_PREFIX, "Received DEBUGGER_RESPONSE message object:", JSON.stringify(message)); // Log full received message
console.log(CS_LOG_PREFIX, `Processing DEBUGGER_RESPONSE for app requestId: ${currentRequestId}. Debugger requestId: ${message.requestId}. Data length: ${message.data ? message.data.length : 'null'}`);
if (!provider) {
console.error(CS_LOG_PREFIX, "Received DEBUGGER_RESPONSE but no provider is active.");
sendResponse({ success: false, error: "No provider active." });
return true;
}
if (typeof provider.handleDebuggerData !== 'function') {
console.error(CS_LOG_PREFIX, `Provider ${provider.name} does not implement handleDebuggerData.`);
sendResponse({ success: false, error: `Provider ${provider.name} does not support debugger method.` });
return true;
}
// IMPORTANT: The message.requestId IS the application's original requestId,
// associated by background.js. We should use this directly.
// The content.js currentRequestId might have been cleared if the provider.sendChatMessage failed,
// but the debugger stream might still be valid for message.requestId.
if (!message.requestId && message.requestId !== 0) { // Check if message.requestId is missing or invalid (0 is a valid requestId)
console.error(CS_LOG_PREFIX, `Received DEBUGGER_RESPONSE without a valid message.requestId. Ignoring. Message:`, message);
sendResponse({ success: false, error: "DEBUGGER_RESPONSE missing requestId." });
return true;
}
// Pass the raw data, the message's requestId, and isFinal flag to the provider
// The provider's handleDebuggerData is responsible for calling handleProviderResponse
console.log(CS_LOG_PREFIX, `Calling provider.handleDebuggerData for requestId: ${message.requestId} with isFinal: ${message.isFinal}`); // Log before call
provider.handleDebuggerData(message.requestId, message.data, message.isFinal, handleProviderResponse);
// Acknowledge receipt of the debugger data
sendResponse({ success: true, message: "Debugger data passed to provider." });
return true; // sendResponse was already called above; the provider reports results separately via handleProviderResponse
} else if (message.type === "PING_TAB") {
console.log(CS_LOG_PREFIX, "Received PING_TAB from background script.");
sendResponse({ success: true, message: "PONG" });
return true;
} else if (message.action === "STOP_STREAMING") {
console.log(CS_LOG_PREFIX, `Received STOP_STREAMING command for requestId: ${message.requestId}`);
if (provider && typeof provider.stopStreaming === 'function') {
provider.stopStreaming(message.requestId);
// The handleProviderResponse might have already cleared currentRequestId if it matched.
// We ensure processingMessage is false if this was the active request.
if (currentRequestId === message.requestId) {
processingMessage = false;
currentRequestId = null; // Explicitly clear here as well
clearResponseMonitoringTimers(); // Ensure any DOM timers are also cleared
console.log(CS_LOG_PREFIX, `STOP_STREAMING: Cleared active currentRequestId ${message.requestId} and processingMessage flag.`);
}
sendResponse({ success: true, message: `Streaming stopped for requestId: ${message.requestId}` });
} else {
console.error(CS_LOG_PREFIX, "Provider or provider.stopStreaming is not available for STOP_STREAMING command.");
sendResponse({ success: false, error: "Provider or stopStreaming method missing." });
}
return true;
}
// Handle other potential message types if needed
// else if (message.type === '...') { ... }
// Unhandled message: log it and return false so the message channel is closed immediately.
console.log(CS_LOG_PREFIX, "Unhandled message type received:", message.type || message.action);
return false;
});
}
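The if/else chain in the listener above routes on `message.type` or `message.action`. One way to keep that routing testable outside the browser is a small dispatch table. The following is only a sketch under stated assumptions: `createDispatcher` and `dispatch` are hypothetical names that do not appear in this commit, and the handler shown simply mirrors the `PING_TAB` branch.

```javascript
// Hypothetical sketch (not part of the original commit): route a message to a
// handler keyed by its `type` or `action` field.
function createDispatcher(handlers) {
  return function dispatch(message, sendResponse) {
    const key = message.type || message.action;
    // Own-property check so inherited names such as "toString" are not treated as handlers.
    const known = Object.prototype.hasOwnProperty.call(handlers, key);
    if (!known || typeof handlers[key] !== "function") {
      sendResponse({ success: false, error: `Unhandled message type: ${key}` });
      return false; // nothing asynchronous is pending
    }
    // A handler returns true only when it will call sendResponse later.
    return handlers[key](message, sendResponse) === true;
  };
}

// Usage: a handler that replies synchronously, like the PING_TAB branch.
const dispatch = createDispatcher({
  PING_TAB: (message, sendResponse) => {
    sendResponse({ success: true, message: "PONG" });
    return false;
  }
});
```

In an extension, `dispatch` would be called from inside the `chrome.runtime.onMessage` listener and its return value passed back to Chrome. Unlike the original listener, this sketch also answers unknown message types with an error instead of staying silent.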
// Generic callback function passed to the provider.
// The provider calls this when it has determined the final response or a chunk of it.
function handleProviderResponse(requestId, responseText, isFinal) {
// Coerce to a string before slicing so a non-string responseText cannot throw inside the log call.
console.log(CS_LOG_PREFIX, `handleProviderResponse called for requestId: ${requestId}. Data length: ${responseText ? String(responseText).length : 'null'}. isFinal: ${isFinal}. Data (first 100 chars): '${String(responseText ?? "").substring(0, 100)}', Type: ${typeof responseText}`);
// The requestId parameter here is the one that the provider determined this response is for.
// This should be the definitive requestId for this piece of data.
// We log if content.js's currentRequestId is different, but proceed with the passed 'requestId'.
if (currentRequestId !== requestId && currentRequestId !== null) { // also check currentRequestId is not null to avoid warning on initial load or after reset
console.warn(CS_LOG_PREFIX, `handleProviderResponse: content.js currentRequestId (${currentRequestId}) differs from provider's response requestId (${requestId}). Proceeding with provider's requestId for data relay.`);
}
// Continue to process with the 'requestId' passed to this function.
if (chrome.runtime && chrome.runtime.sendMessage) {
const MAX_RESPONSE_TEXT_LENGTH = 500 * 1024; // Safety limit: 512,000 characters (UTF-16 code units), not bytes
let messageToSendToBackground;
if (responseText && typeof responseText === 'string' && responseText.length > MAX_RESPONSE_TEXT_LENGTH) {
console.warn(CS_LOG_PREFIX, `ResponseText for requestId ${requestId} is too large (${responseText.length} characters). Sending an error in place of the text.`);
messageToSendToBackground = {
type: "FINAL_RESPONSE_TO_RELAY",
requestId: requestId,
error: `Response too large to transmit (length: ${responseText.length}, limit: ${MAX_RESPONSE_TEXT_LENGTH}).`,
// text: responseText.substring(0, MAX_RESPONSE_TEXT_LENGTH) + "... (truncated by content.js)", // Optionally send truncated
text: `Error: Response too large (length: ${responseText.length}). See the provider's chat page for the full response.`, // Simpler error text
isFinal: true // This is a final error state
};
} else {
messageToSendToBackground = {
type: "FINAL_RESPONSE_TO_RELAY",
requestId: requestId,
text: responseText, // Can be null if AIStudioProvider parsed it as such
isFinal: isFinal
};
}
console.log(CS_LOG_PREFIX, `[REQ-${requestId}] PRE-SEND to BG: Type: ${messageToSendToBackground.type}, isFinal: ${messageToSendToBackground.isFinal}, HasError: ${!!messageToSendToBackground.error}, TextLength: ${messageToSendToBackground.text ? String(messageToSendToBackground.text).length : (messageToSendToBackground.error ? String(messageToSendToBackground.error).length : 'N/A')}`);
try {
chrome.runtime.sendMessage(messageToSendToBackground, response => {
if (chrome.runtime.lastError) {
console.error(CS_LOG_PREFIX, `[REQ-${requestId}] SEND FAILED to BG: ${chrome.runtime.lastError.message}. Message attempted:`, JSON.stringify(messageToSendToBackground).substring(0, 500));
} else {
console.log(CS_LOG_PREFIX, `[REQ-${requestId}] SEND SUCCESS to BG. Ack from BG:`, response);
}
});
} catch (syncError) {
console.error(CS_LOG_PREFIX, `[REQ-${requestId}] SYNC ERROR sending to BG: ${syncError.message}. Message attempted:`, JSON.stringify(messageToSendToBackground).substring(0, 500), syncError);
}
} else {
console.error(CS_LOG_PREFIX, "Cannot send FINAL_RESPONSE_TO_RELAY, runtime is invalid.");
}
if (isFinal) {
// Reset content script state AFTER sending the final response message,
// but only if the finalized requestId matches what content.js currently considers its active request.
if (currentRequestId === requestId) {
processingMessage = false;
currentRequestId = null;
clearResponseMonitoringTimers(); // Clear any timers associated with this request
console.log(CS_LOG_PREFIX, `Processing finished for active requestId: ${requestId}. State reset in content.js.`);
} else {
console.log(CS_LOG_PREFIX, `Processing finished for requestId: ${requestId}. This was not the active content.js requestId (${currentRequestId}), so content.js state not altered by this finalization. However, timers for ${requestId} might need explicit cleanup if any were started by it.`);
// If specific timers were associated with 'requestId' (not currentRequestId), they should be cleared by the provider or a more granular timer management.
}
} else {
console.log(CS_LOG_PREFIX, `Partial response processed for requestId: ${requestId}. Awaiting more data or final flag.`);
}
}
// Call initialization functions
// Ensure DOM is ready for provider detection and DOM manipulations
if (document.readyState === "loading") {
document.addEventListener("DOMContentLoaded", attemptInitialization);
} else {
attemptInitialization(); // DOMContentLoaded has already fired
}
function attemptInitialization() {
console.log(CS_LOG_PREFIX, "Attempting initialization...");
if (window.attemptedInitialization) {
console.log(CS_LOG_PREFIX, "Initialization already attempted. Skipping.");
return;
}
window.attemptedInitialization = true;
initializeContentRelay(); // Initialize provider detection, DOM setup, etc.
setupMessageListeners(); // Setup listeners for messages from background script
console.log(CS_LOG_PREFIX, "Initialization attempt complete. Message listeners set up.");
}
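The size guard in `handleProviderResponse` can be isolated as a pure function, which makes it possible to unit-test the payload shape without a browser. This is a minimal sketch, not the commit's implementation: `buildRelayMessage` and its `maxLength` parameter are hypothetical names, and the limit counts UTF-16 code units (`String.prototype.length`), not bytes.

```javascript
// Hypothetical helper (not part of the original commit): builds the payload
// that handleProviderResponse would pass to chrome.runtime.sendMessage.
const MAX_RESPONSE_TEXT_LENGTH = 500 * 1024; // measured in UTF-16 code units

function buildRelayMessage(requestId, responseText, isFinal, maxLength = MAX_RESPONSE_TEXT_LENGTH) {
  const tooLarge = typeof responseText === "string" && responseText.length > maxLength;
  if (tooLarge) {
    // Oversized responses become a final error so the request cannot hang.
    return {
      type: "FINAL_RESPONSE_TO_RELAY",
      requestId,
      error: `Response too large to transmit (length: ${responseText.length}).`,
      text: `Error: Response too large (length: ${responseText.length}).`,
      isFinal: true
    };
  }
  // Normal case: pass the text (which may be null) and the caller's isFinal flag through.
  return { type: "FINAL_RESPONSE_TO_RELAY", requestId, text: responseText, isFinal };
}
```

Keeping the payload construction separate from the `chrome.runtime.sendMessage` call means only the thin sending wrapper depends on the extension runtime.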


71
extension/manifest.json Normal file

@ -0,0 +1,71 @@
{
"manifest_version": 3,
"name": "AI Chat Relay",
"version": "1.0",
"description": "Relays messages between an OpenAI-compatible API server and chat interfaces",
"permissions": [
"scripting",
"storage",
"alarms",
"debugger",
"tabs"
],
"optional_permissions": [
],
"host_permissions": [
"*://*.google.com/*",
"*://*.chatgpt.com/*",
"*://*.aistudio.com/*",
"*://*.claude.ai/*",
"ws://localhost:*/"
],
"background": {
"service_worker": "background.js"
},
"content_scripts": [
{
"matches": ["*://*.chatgpt.com/*"],
"js": [
"providers/provider-utils.js",
"providers/chatgpt.js",
"content.js"
],
"run_at": "document_idle"
},
{
"matches": ["*://aistudio.google.com/*"],
"js": [
"providers/provider-utils.js",
"providers/aistudio.js",
"content.js"
],
"run_at": "document_idle"
},
{
"matches": ["*://claude.ai/*"],
"js": [
"providers/provider-utils.js",
"providers/claude.js",
"content.js"
],
"run_at": "document_idle"
},
{
"matches": ["*://gemini.google.com/*"],
"js": [
"providers/provider-utils.js",
"providers/gemini.js",
"content.js"
],
"run_at": "document_idle"
}
],
"action": {
"default_title": "AI Chat Relay",
"default_popup": "popup.html"
},
"options_ui": {
"page": "options.html",
"open_in_tab": true
}
}

98
extension/options.html Normal file

@ -0,0 +1,98 @@
<!--
Chat Relay: Relay for AI Chat Interfaces
Copyright (C) 2025 Jamison Moore
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see https://www.gnu.org/licenses/.
-->
<!DOCTYPE html>
<html>
<head>
<title>AI Chat Relay Options</title>
<style>
body {
font-family: Arial, sans-serif;
padding: 20px;
max-width: 600px;
margin: 0 auto;
}
.form-group {
margin-bottom: 15px;
}
label {
display: block;
margin-bottom: 5px;
font-weight: bold;
}
input[type="text"], input[type="number"] {
width: 100%;
padding: 8px;
box-sizing: border-box;
border: 1px solid #ccc;
border-radius: 4px;
}
button {
background-color: #4285f4;
color: white;
border: none;
padding: 10px 15px;
border-radius: 4px;
cursor: pointer;
}
button:hover {
background-color: #3367d6;
}
.status {
margin-top: 15px;
padding: 10px;
border-radius: 4px;
display: none;
}
.success {
background-color: #d4edda;
color: #155724;
}
.error {
background-color: #f8d7da;
color: #721c24;
}
</style>
</head>
<body>
<h1>AI Chat Relay Options</h1>
<div class="form-group">
<label for="serverHost">Server Host:</label>
<input type="text" id="serverHost" placeholder="localhost">
</div>
<div class="form-group">
<label for="serverPort">Server Port:</label>
<input type="number" id="serverPort" min="1" max="65535" placeholder="3003">
</div>
<div class="form-group">
<label for="serverProtocol">Server Protocol:</label>
<select id="serverProtocol">
<option value="ws">WebSocket (ws://)</option>
<option value="wss">Secure WebSocket (wss://)</option>
</select>
</div>
<button id="save">Save Settings</button>
<div id="status" class="status"></div>
<script src="options.js"></script>
</body>
</html>

95
extension/options.js Normal file

@ -0,0 +1,95 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// Default settings
const DEFAULT_SETTINGS = {
serverHost: 'localhost',
serverPort: 3003,
serverProtocol: 'ws'
};
// DOM elements
const hostInput = document.getElementById('serverHost');
const portInput = document.getElementById('serverPort');
const protocolSelect = document.getElementById('serverProtocol');
const saveButton = document.getElementById('save');
const statusDiv = document.getElementById('status');
// Load saved settings
function loadSettings() {
chrome.storage.sync.get(DEFAULT_SETTINGS, (items) => {
hostInput.value = items.serverHost;
portInput.value = items.serverPort;
protocolSelect.value = items.serverProtocol;
});
}
// Save settings
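function saveSettings() {
// Parse the port in base 10; fall back to the default when it is missing or out of range
const parsedPort = parseInt(portInput.value, 10);
const settings = {
serverHost: hostInput.value.trim() || DEFAULT_SETTINGS.serverHost,
serverPort: (parsedPort >= 1 && parsedPort <= 65535) ? parsedPort : DEFAULT_SETTINGS.serverPort,
serverProtocol: protocolSelect.value
};
chrome.storage.sync.set(settings, () => {
// Report storage failures instead of always claiming success
if (chrome.runtime.lastError) {
showStatus(`Failed to save settings: ${chrome.runtime.lastError.message}`, 'error');
return;
}
// Show success message
showStatus('Settings saved successfully!', 'success');
// Update host permissions if needed
updateHostPermissions(settings);
});
}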
// Show status message
function showStatus(message, type) {
statusDiv.textContent = message;
statusDiv.className = 'status ' + type;
statusDiv.style.display = 'block';
// Hide after 3 seconds
setTimeout(() => {
statusDiv.style.display = 'none';
}, 3000);
}
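// Update host permissions if needed.
// Note: chrome.permissions.request() only succeeds for permissions declared as optional in the
// manifest and must run in response to a user gesture, so this request can be rejected.
function updateHostPermissions(settings) {
const url = `${settings.serverProtocol}://${settings.serverHost}:${settings.serverPort}/`;
// Check if we already have permission
chrome.permissions.contains({ origins: [url] }, (hasPermission) => {
if (chrome.runtime.lastError) {
console.warn(`Could not check permission for ${url}: ${chrome.runtime.lastError.message}`);
return;
}
if (hasPermission) {
return;
}
// Request new permission
chrome.permissions.request({ origins: [url] }, (granted) => {
if (chrome.runtime.lastError || !granted) {
showStatus('Warning: Permission not granted for the server URL. The extension may not work correctly.', 'error');
return;
}
console.log(`Permission granted for ${url}`);
});
});
}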
// Event listeners
document.addEventListener('DOMContentLoaded', loadSettings);
saveButton.addEventListener('click', saveSettings);
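The port handling in `saveSettings` can also be written as a standalone function that is easy to test. This is a sketch only: `parsePort` is a hypothetical name that does not appear in this commit, and it is stricter than `parseInt` alone because it rejects input with trailing non-digit characters such as `12abc`.

```javascript
// Hypothetical helper (not in the original commit): parse a TCP port from user
// input, returning `fallback` unless the value is an integer in 1..65535.
function parsePort(value, fallback) {
  const text = String(value).trim();
  // Accept digits only, so "12abc" or "3003.5" fall back instead of being partially parsed.
  if (!/^\d+$/.test(text)) {
    return fallback;
  }
  const port = Number.parseInt(text, 10);
  return port >= 1 && port <= 65535 ? port : fallback;
}
```

In `saveSettings` it could be used as `serverPort: parsePort(portInput.value, DEFAULT_SETTINGS.serverPort)`.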

81
extension/popup.html Normal file

@ -0,0 +1,81 @@
<!--
Chat Relay: Relay for AI Chat Interfaces
Copyright (C) 2025 Jamison Moore
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as
published by the Free Software Foundation, either version 3 of the
License, or (at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see https://www.gnu.org/licenses/.
-->
<!DOCTYPE html>
<html>
<head>
<title>AI Chat Relay</title>
<style>
body {
font-family: Arial, sans-serif;
width: 300px;
padding: 15px;
}
h1 {
font-size: 18px;
margin-top: 0;
}
.status {
margin: 10px 0;
padding: 8px;
border-radius: 4px;
}
.connected {
background-color: #d4edda;
color: #155724;
}
.disconnected {
background-color: #f8d7da;
color: #721c24;
}
button {
background-color: #4285f4;
color: white;
border: none;
padding: 8px 12px;
border-radius: 4px;
cursor: pointer;
margin-top: 10px;
width: 100%;
}
button:hover {
background-color: #3367d6;
}
.footer {
margin-top: 15px;
font-size: 12px;
color: #666;
text-align: center;
}
</style>
</head>
<body>
<h1>AI Chat Relay</h1>
<div id="connectionStatus" class="status disconnected">
Checking connection status...
</div>
<button id="openOptions">Open Settings</button>
<div class="footer">
<p>Current server: <span id="serverUrl">ws://localhost:3003</span></p>
</div>
<script src="popup.js"></script>
</body>
</html>

75
extension/popup.js Normal file

@ -0,0 +1,75 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// DOM elements
const connectionStatus = document.getElementById('connectionStatus');
const openOptionsButton = document.getElementById('openOptions');
const serverUrlSpan = document.getElementById('serverUrl');
// Default settings
const DEFAULT_SETTINGS = {
serverHost: 'localhost',
serverPort: 3003,
serverProtocol: 'ws'
};
// Load settings and update UI
function loadSettings() {
chrome.storage.sync.get(DEFAULT_SETTINGS, (items) => {
const serverUrl = `${items.serverProtocol}://${items.serverHost}:${items.serverPort}`;
serverUrlSpan.textContent = serverUrl;
// Check connection status
checkConnectionStatus();
});
}
// Check connection status with the background script
function checkConnectionStatus() {
chrome.runtime.sendMessage({ action: "GET_CONNECTION_STATUS" }, (response) => {
// Treat a messaging error or a missing/negative reply as "disconnected"
const isConnected = !chrome.runtime.lastError && Boolean(response && response.connected);
updateConnectionStatus(isConnected);
});
}
// Update the connection status UI
function updateConnectionStatus(isConnected) {
if (isConnected) {
connectionStatus.className = 'status connected';
connectionStatus.textContent = 'Connected to relay server';
} else {
connectionStatus.className = 'status disconnected';
connectionStatus.textContent = 'Disconnected from relay server';
}
}
// Open the options page
function openOptions() {
chrome.runtime.openOptionsPage();
}
// Event listeners
document.addEventListener('DOMContentLoaded', loadSettings);
openOptionsButton.addEventListener('click', openOptions);


@ -0,0 +1,737 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// AI Chat Relay - AI Studio Provider
class AIStudioProvider {
constructor() {
// --- START OF CONFIGURABLE PROPERTIES ---
// Method for response capture: "debugger" or "dom"
this.captureMethod = "debugger";
// URL pattern for debugger to intercept if captureMethod is "debugger". Ensure this is specific.
this.debuggerUrlPattern = "*MakerSuiteService/GenerateContent*"; // VERIFY THIS PATTERN
// Whether to include "thinking" process in the message or just the final answer.
// If true, parseDebuggerResponse returns a JSON string: { "thinking": "...", "answer": "..." }
// If false, parseDebuggerResponse returns a string: "answer"
this.includeThinkingInMessage = false;
// Option to enable AI Studio function calling on load
// ENABLE_AISTUDIO_FUNCTION_CALLING: true or false
this.ENABLE_AISTUDIO_FUNCTION_CALLING = true;
// --- END OF CONFIGURABLE PROPERTIES ---
this.name = "AIStudioProvider"; // Updated name
this.supportedDomains = ["aistudio.google.com"];
// Selectors for the AI Studio interface
this.inputSelector = 'textarea.textarea, textarea.gmat-body-medium, textarea[aria-label="Type something or pick one from prompt gallery"]';
// The send button selector
this.sendButtonSelector = 'button.run-button, button[aria-label="Run"], button.mat-mdc-tooltip-trigger.run-button';
// Updated response selectors based on the actual elements
this.responseSelector = '.response-container, .response-text, .model-response, .model-response-container, ms-chat-turn, ms-prompt-chunk, ms-text-chunk, .very-large-text-container, .cmark-node';
// Thinking indicator selector
this.thinkingIndicatorSelector = '.thinking-indicator, .loading-indicator, .typing-indicator, .response-loading, loading-indicator';
// Fallback selectors
this.responseSelectorForDOMFallback = '.response-container, .model-response-text'; // Placeholder, adjust as needed
this.thinkingIndicatorSelectorForDOM = '.thinking-indicator, .spinner'; // Placeholder, adjust as needed
// Last sent message to avoid capturing it as a response
this.lastSentMessage = '';
// Initialize pendingResponseCallbacks
this.pendingResponseCallbacks = new Map();
// Call the method to ensure function calling is enabled on initial load
this.ensureFunctionCallingEnabled();
// Listen for SPA navigation events to re-trigger the check
if (window.navigation) {
window.navigation.addEventListener('navigate', (event) => {
// We are interested in same-document navigations, common in SPAs
if (!event.canIntercept || event.hashChange || event.downloadRequest !== null) {
return;
}
// Check if the navigation is within the same origin and path structure of AI Studio
const currentUrl = new URL(window.location.href);
const destinationUrl = new URL(event.destination.url);
if (currentUrl.origin === destinationUrl.origin && destinationUrl.pathname.startsWith("/prompts/")) {
console.log(`[${this.name}] Detected SPA navigation to: ${event.destination.url}. Re-checking function calling toggle.`);
// Use a timeout to allow the new view's DOM to settle
setTimeout(() => {
this.ensureFunctionCallingEnabled();
}, 1000); // Delay to allow DOM update
}
});
} else {
console.warn(`[${this.name}] window.navigation API not available. Function calling toggle may not re-enable on SPA navigations.`);
}
}
ensureFunctionCallingEnabled() {
if (!this.ENABLE_AISTUDIO_FUNCTION_CALLING) {
console.log(`[${this.name}] Function calling is disabled by configuration. Skipping.`);
return;
}
const checkInterval = 500; // ms
const maxDuration = 7000; // ms
let elapsedTime = 0;
const providerName = this.name;
// Clear any existing timer for this specific functionality to avoid multiple polling loops
if (this.functionCallingPollTimer) {
clearTimeout(this.functionCallingPollTimer);
this.functionCallingPollTimer = null;
console.log(`[${providerName}] Cleared previous function calling poll timer.`);
}
console.log(`[${providerName}] Ensuring function calling is enabled (polling up to ${maxDuration / 1000}s).`);
const tryEnableFunctionCalling = () => {
console.log(`[${providerName}] Polling for function calling toggle. Elapsed: ${elapsedTime}ms`);
const functionCallingToggle = document.querySelector('button[aria-label="Function calling"]');
if (functionCallingToggle) {
const isChecked = functionCallingToggle.getAttribute('aria-checked') === 'true';
if (!isChecked) {
console.log(`[${providerName}] Function calling toggle found and is NOT checked. Attempting to enable...`);
functionCallingToggle.click();
// Verify after a short delay if the click was successful
setTimeout(() => {
const nowChecked = functionCallingToggle.getAttribute('aria-checked') === 'true';
if (nowChecked) {
console.log(`[${providerName}] Function calling successfully enabled after click.`);
} else {
console.warn(`[${providerName}] Clicked function calling toggle, but it did NOT become checked. It might be disabled or unresponsive.`);
}
}, 200);
} else {
console.log(`[${providerName}] Function calling toggle found and is already enabled.`);
}
this.functionCallingPollTimer = null; // Clear timer once action is taken or element found
} else {
elapsedTime += checkInterval;
if (elapsedTime < maxDuration) {
console.log(`[${providerName}] Function calling toggle not found, will retry in ${checkInterval}ms.`);
this.functionCallingPollTimer = setTimeout(tryEnableFunctionCalling, checkInterval);
} else {
console.warn(`[${providerName}] Function calling toggle button (selector: 'button[aria-label="Function calling"]') not found after ${maxDuration}ms. It might not be available on this page/view or selector is incorrect.`);
this.functionCallingPollTimer = null; // Clear timer
}
}
};
// Start the first attempt after a brief initial delay
this.functionCallingPollTimer = setTimeout(tryEnableFunctionCalling, 500);
}
// Send a message to the chat interface
async sendChatMessage(messageContent) {
console.log(`[${this.name}] sendChatMessage called with content type:`, typeof messageContent, Array.isArray(messageContent) ? `Array length: ${messageContent.length}` : '');
const inputField = document.querySelector(this.inputSelector);
const sendButton = document.querySelector(this.sendButtonSelector);
if (!inputField || !sendButton) {
console.error(`[${this.name}] Missing input field or send button. Input: ${this.inputSelector}, Button: ${this.sendButtonSelector}`);
return false;
}
console.log(`[${this.name}] Attempting to send message to AI Studio with:`, {
inputField: inputField.className,
sendButton: sendButton.getAttribute('aria-label') || sendButton.className
});
try {
let textToInput = "";
let blobToPaste = null;
let blobMimeType = "image/png"; // Default MIME type
if (typeof messageContent === 'string') {
textToInput = messageContent;
this.lastSentMessage = textToInput;
console.log(`[${this.name}] Handling string content:`, textToInput.substring(0, 100) + "...");
} else if (messageContent instanceof Blob) {
blobToPaste = messageContent;
blobMimeType = messageContent.type || blobMimeType;
this.lastSentMessage = `Blob data (type: ${blobMimeType}, size: ${blobToPaste.size})`;
console.log(`[${this.name}] Handling Blob content. Size: ${blobToPaste.size}, Type: ${blobMimeType}`);
} else if (Array.isArray(messageContent)) {
console.log(`[${this.name}] Handling array content.`);
for (const part of messageContent) {
if (part.type === "text" && typeof part.text === 'string') {
textToInput += (textToInput ? "\n" : "") + part.text;
console.log(`[${this.name}] Added text part:`, part.text.substring(0, 50) + "...");
} else if (part.type === "image_url" && part.image_url && typeof part.image_url.url === 'string') {
if (!blobToPaste) { // Prioritize the first image found
try {
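const response = await fetch(part.image_url.url);
// fetch() resolves on HTTP errors too, so reject non-2xx responses before reading the body
if (!response.ok) {
throw new Error(`HTTP ${response.status} while fetching image`);
}
blobToPaste = await response.blob();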
blobMimeType = blobToPaste.type || blobMimeType;
console.log(`[${this.name}] Fetched image_url as Blob. Size: ${blobToPaste.size}, Type: ${blobMimeType}`);
} catch (e) {
console.error(`[${this.name}] Error fetching image_url ${part.image_url.url}:`, e);
}
} else {
console.warn(`[${this.name}] Multiple image_urls found, only the first will be pasted.`);
}
}
}
this.lastSentMessage = `Array content (Text: "${textToInput.substring(0,50)}...", Image: ${blobToPaste ? 'Yes' : 'No'})`;
} else {
console.error(`[${this.name}] Unhandled message content type: ${typeof messageContent}. Cannot send.`);
this.lastSentMessage = `Unhandled data type: ${typeof messageContent}`;
return false;
}
// Set text input if any
if (textToInput) {
inputField.value = textToInput;
inputField.dispatchEvent(new Event('input', { bubbles: true }));
console.log(`[${this.name}] Set input field value with accumulated text.`);
} else {
// If there's no text but an image, ensure the input field is clear if AI Studio requires it
// inputField.value = "";
// inputField.dispatchEvent(new Event('input', { bubbles: true }));
}
// Paste blob if any
if (blobToPaste) {
const dataTransfer = new DataTransfer();
const file = new File([blobToPaste], "pasted_image." + (blobMimeType.split('/')[1] || 'png'), { type: blobMimeType });
dataTransfer.items.add(file);
const pasteEvent = new ClipboardEvent('paste', {
clipboardData: dataTransfer,
bubbles: true,
cancelable: true
});
inputField.dispatchEvent(pasteEvent);
console.log(`[${this.name}] Dispatched paste event with Blob data.`);
}
inputField.focus();
await new Promise(resolve => setTimeout(resolve, 100));
let attempts = 0;
const maxAttempts = 60; // Try up to 60 times (5 minutes total)
const retryDelay = 5000; // 5 seconds delay between attempts
while (attempts < maxAttempts) {
const isDisabled = sendButton.disabled ||
sendButton.getAttribute('aria-disabled') === 'true' ||
sendButton.classList.contains('disabled');
if (!isDisabled) {
// Removed check for input field content matching lastSentMessage
// as it can cause issues when there are multiple messages waiting to be sent
console.log(`[${this.name}] Send button is enabled. Clicking send button (attempt ${attempts + 1}).`);
sendButton.click();
return true; // Successfully clicked
}
attempts++;
if (attempts >= maxAttempts) {
console.error(`[${this.name}] Send button remained disabled after ${maxAttempts} attempts. Failed to send message.`);
return false; // Failed to send
}
console.log(`[${this.name}] Send button is disabled (attempt ${attempts}). Trying to enable and will retry in ${retryDelay}ms.`);
// Attempt to trigger UI updates that might enable the button
inputField.dispatchEvent(new Event('input', { bubbles: true })); // Re-dispatch input
inputField.dispatchEvent(new Event('change', { bubbles: true }));
inputField.dispatchEvent(new Event('blur', { bubbles: true }));
// Focusing and blurring the input sometimes helps enable send buttons
inputField.focus();
await new Promise(resolve => setTimeout(resolve, 50)); // Short delay for focus
inputField.blur();
await new Promise(resolve => setTimeout(resolve, retryDelay));
}
// Should not be reached if logic is correct, but as a fallback:
console.error(`[${this.name}] Exited send button check loop unexpectedly.`);
return false;
} catch (error) {
console.error(`[${this.name}] Error sending message to AI Studio:`, error);
return false;
}
}
initiateResponseCapture(requestId, responseCallback) {
console.log(`[${this.name}] initiateResponseCapture called for requestId: ${requestId}. CURRENT CAPTURE METHOD IS: ${this.captureMethod}`);
if (this.captureMethod === "debugger") {
this.pendingResponseCallbacks.set(requestId, responseCallback);
console.log(`[${this.name}] Stored callback for debugger response, requestId: ${requestId}`);
} else if (this.captureMethod === "dom") {
console.log(`[${this.name}] Starting DOM monitoring for requestId: ${requestId}`);
this.pendingResponseCallbacks.set(requestId, responseCallback);
this._stopDOMMonitoring();
this._startDOMMonitoring(requestId);
} else {
console.error(`[${this.name}] Unknown capture method: ${this.captureMethod}`);
responseCallback(requestId, `[Error: Unknown capture method '${this.captureMethod}' in provider]`, true);
this.pendingResponseCallbacks.delete(requestId);
}
}
handleDebuggerData(requestId, rawData, isFinalFromBackground) { // Renamed isFinal to isFinalFromBackground for clarity
console.log(`[${this.name}] handleDebuggerData called for requestId: ${requestId}. Raw data length: ${rawData ? rawData.length : 'null'}. isFinalFromBackground: ${isFinalFromBackground}`);
const callback = this.pendingResponseCallbacks.get(requestId);
if (!callback) {
console.warn(`[${this.name}] No pending callback found for debugger data with requestId: ${requestId}. Ignoring.`);
return;
}
let parsedText = "";
let contentHasInternalFinalMarker = false;
if (rawData && rawData.trim() !== "") {
const parseOutput = this.parseDebuggerResponse(rawData);
parsedText = parseOutput.text;
contentHasInternalFinalMarker = parseOutput.isFinalResponse; // Use the parser's determination
console.log(`[${this.name}] Debugger data parsed for requestId: ${requestId}. Parsed text (first 100 chars): '${(parsedText || "").substring(0,100)}', Type: ${typeof parsedText}, ChunkHasFinalMarkerFromParser: ${contentHasInternalFinalMarker}`);
} else {
console.log(`[${this.name}] Received empty rawData from debugger for requestId: ${requestId}. isFinalFromBackground: ${isFinalFromBackground}`);
// If rawData is empty, text remains empty.
// If background says it's final, but data is empty, it's still final.
}
// The response is considered final for the callback if:
// 1. The background script explicitly states this is the final debugger event for the request OR
// 2. The provider's own parsing of the current chunk's content indicates it's the end of the AI's message.
const isFinalForCallback = isFinalFromBackground || contentHasInternalFinalMarker;
console.log(`[${this.name}] Calling callback for requestId ${requestId} with text (first 100): '${(parsedText || "").substring(0,100)}', isFinalForCallback: ${isFinalForCallback} (isFinalFromBackground: ${isFinalFromBackground}, contentHasInternalFinalMarker: ${contentHasInternalFinalMarker})`);
callback(requestId, parsedText, isFinalForCallback);
// If the callback was told this is the final response, then clean up.
if (isFinalForCallback) {
console.log(`[${this.name}] Final event processed for requestId: ${requestId} (isFinalForCallback was true). Removing callback.`);
this.pendingResponseCallbacks.delete(requestId);
}
}
// --- Internal DOM Capture Logic (largely unchanged but kept for completeness) ---
_captureResponseDOM(element = null) {
console.log(`[${this.name}] _captureResponseDOM (DOM method) called with element:`, element);
if (!element && this.captureMethod === "dom") {
const elements = document.querySelectorAll(this.responseSelector);
if (elements.length > 0) {
element = elements[elements.length - 1];
console.log(`[${this.name}] _captureResponseDOM: Found element via querySelector during polling.`);
}
}
if (!element) {
console.log(`[${this.name}] _captureResponseDOM: No element provided or found.`);
return { found: false, text: '' };
}
if (this._isResponseStillGeneratingDOM()) {
console.log(`[${this.name}] Response is still being generated (_isResponseStillGeneratingDOM check), waiting for completion`);
return { found: false, text: '' };
}
console.log(`[${this.name}] Attempting to capture DOM response from AI Studio...`);
let responseText = "";
let foundResponse = false;
try {
console.log("AISTUDIO: Looking for response in various elements...");
if (element.textContent) {
console.log("AISTUDIO: Element has text content");
responseText = element.textContent.trim();
if (responseText &&
// Removed check for responseText !== this.lastSentMessage
!responseText.includes("Loading") &&
!responseText.includes("Thinking") &&
!responseText.includes("Expand to view model thoughts")) {
console.log("AISTUDIO: Found response in element:", responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
} else {
console.log("AISTUDIO: Element text appears to be invalid:", responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
}
} else {
console.log("AISTUDIO: Element has no text content");
}
console.log("AISTUDIO: Trying to find the most recent chat turn...");
const chatTurns = document.querySelectorAll('ms-chat-turn');
if (chatTurns && chatTurns.length > 0) {
console.log(`AISTUDIO: Found ${chatTurns.length} chat turns`);
const lastChatTurn = chatTurns[chatTurns.length - 1];
const isModelTurn = lastChatTurn.querySelector('.model-prompt-container');
if (isModelTurn) {
console.log("AISTUDIO: Last chat turn is a model turn");
const allTextChunks = document.querySelectorAll('ms-text-chunk');
if (allTextChunks && allTextChunks.length > 0) {
console.log(`AISTUDIO: Found ${allTextChunks.length} ms-text-chunk elements in the document`);
const lastTextChunk = allTextChunks[allTextChunks.length - 1];
console.log("AISTUDIO: Last ms-text-chunk found:", lastTextChunk);
const responseSpan = lastTextChunk.querySelector('span.ng-star-inserted');
if (responseSpan) {
console.log("AISTUDIO: Found response span in last ms-text-chunk");
const text = responseSpan.textContent.trim();
if (text &&
// Removed check for text !== this.lastSentMessage
!text.includes("Loading") && !text.includes("Thinking") && !text.includes("Expand to view model thoughts")) {
responseText = text;
console.log("AISTUDIO: Found response in span:", responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
}
} else {
console.log("AISTUDIO: No response span found, getting text directly from ms-text-chunk");
const text = lastTextChunk.textContent.trim();
if (text &&
// Removed check for text !== this.lastSentMessage
!text.includes("Loading") && !text.includes("Thinking") && !text.includes("Expand to view model thoughts")) {
responseText = text;
console.log("AISTUDIO: Found response in ms-text-chunk:", responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
}
}
}
if (!foundResponse) {
const paragraphs = lastChatTurn.querySelectorAll('p');
if (paragraphs && paragraphs.length > 0) {
console.log(`AISTUDIO: Found ${paragraphs.length} paragraphs in last chat turn`);
let combinedText = "";
paragraphs.forEach((p) => {
const isInThoughtChunk = p.closest('ms-thought-chunk');
if (!isInThoughtChunk) {
const text = p.textContent.trim();
if (text &&
// Removed check for text !== this.lastSentMessage
!text.includes("Loading") && !text.includes("Thinking") && !text.includes("Expand to view model thoughts")) {
combinedText += text + "\n";
}
}
});
if (combinedText.trim()) {
responseText = combinedText.trim();
console.log("AISTUDIO: Found response in paragraphs:", responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
}
}
}
}
}
if (!foundResponse) {
console.log("AISTUDIO: Trying to find ms-chat-turn elements (fallback)...");
const chatTurnsFallback = document.querySelectorAll('ms-chat-turn');
if (chatTurnsFallback && chatTurnsFallback.length > 0) {
const lastChatTurnFallback = chatTurnsFallback[chatTurnsFallback.length - 1];
const paragraphsFallback = lastChatTurnFallback.querySelectorAll('p');
if (paragraphsFallback && paragraphsFallback.length > 0) {
let combinedTextFallback = "";
paragraphsFallback.forEach((p) => {
const text = p.textContent.trim();
if (text &&
// Removed check for text !== this.lastSentMessage
!text.includes("Loading") && !text.includes("Thinking") && !text.includes("Expand to view model thoughts")) {
combinedTextFallback += text + "\n";
}
});
if (combinedTextFallback.trim()) {
responseText = combinedTextFallback.trim();
foundResponse = true;
}
}
if (!foundResponse) {
const textFallback = lastChatTurnFallback.textContent.trim();
if (textFallback &&
// Removed check for textFallback !== this.lastSentMessage
!textFallback.includes("Loading") && !textFallback.includes("Thinking") && !textFallback.includes("Expand to view model thoughts")) {
responseText = textFallback;
foundResponse = true;
}
}
}
}
if (!foundResponse) {
console.log("AISTUDIO: Trying to find .very-large-text-container elements...");
const textContainers = document.querySelectorAll('.very-large-text-container');
if (textContainers && textContainers.length > 0) {
for (let i = textContainers.length - 1; i >= 0; i--) {
const textContainer = textContainers[i];
const text = textContainer.textContent.trim();
if (text &&
// Removed check for text !== this.lastSentMessage
!text.includes("Loading") && !text.includes("Thinking") && !text.includes("Expand to view model thoughts")) {
responseText = text;
foundResponse = true;
break;
}
}
}
}
if (!foundResponse) {
console.log("AISTUDIO: Trying to find paragraphs in the document (last resort)...");
const paragraphsDoc = document.querySelectorAll('p');
if (paragraphsDoc && paragraphsDoc.length > 0) {
let combinedTextDoc = "";
for (let i = paragraphsDoc.length - 1; i >= 0; i--) {
const paragraph = paragraphsDoc[i];
const isUserChunk = paragraph.closest('.user-chunk');
if (isUserChunk) continue;
const text = paragraph.textContent.trim();
if (text &&
// Removed check for text !== this.lastSentMessage
!text.includes("Loading") && !text.includes("Thinking") && !text.includes("Expand to view model thoughts")) {
combinedTextDoc = text + "\n" + combinedTextDoc;
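// Heuristic stop condition: these phrases are assumed to open the model's reply, so older paragraphs are not collected
if (text.startsWith("Hello") || text.includes("I'm doing") || text.includes("How can I assist")) break;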
}
}
if (combinedTextDoc.trim()) {
responseText = combinedTextDoc.trim();
foundResponse = true;
}
}
}
if (!foundResponse) {
console.log("AISTUDIO: Response not found yet via DOM.");
}
} catch (error) {
console.error("AISTUDIO: Error capturing response from AI Studio (DOM):", error);
}
if (foundResponse && responseText) {
responseText = responseText.trim()
.replace(/^(Loading|Thinking).*/gim, '')
.replace(/Expand to view model thoughts.*/gim, '')
.replace(/\n{3,}/g, '\n\n')
.trim();
}
return {
found: foundResponse && !!responseText.trim(),
text: responseText
};
}
// --- START OF CORRECTED DEBUGGER PARSING LOGIC ---
parseDebuggerResponse(jsonString) {
console.log(`[${this.name}] Parsing debugger response (AI Studio specific)... Input jsonString (first 200):`, jsonString ? jsonString.substring(0,200) : "null", "Type:", typeof jsonString);
if (!jsonString || jsonString.trim() === "") {
console.warn(`[${this.name}] parseDebuggerResponse called with empty or null jsonString.`);
return { text: "", isFinalResponse: false };
}
let thinkingAndProcessText = "";
let actualResponseText = "";
let overallMarkerFound = false;
function findEndOfUnitMarker(data) {
if (Array.isArray(data)) {
if (data.length >= 2 && data[data.length - 1] === 1 && data[data.length - 2] === "model") {
return true;
}
for (const item of data) {
if (findEndOfUnitMarker(item)) {
return true;
}
}
}
return false;
}
function extractTextSegments(data, segments = []) {
if (Array.isArray(data)) {
if (data.length > 1 && data[0] === null && typeof data[1] === 'string') {
segments.push(data[1]);
} else {
for (const item of data) {
extractTextSegments(item, segments);
}
}
}
return segments;
}
try {
const parsedJson = JSON.parse(jsonString);
if (Array.isArray(parsedJson)) {
for (let i = 0; i < parsedJson.length; i++) {
const chunk = parsedJson[i];
const textSegmentsInChunk = extractTextSegments(chunk);
if (textSegmentsInChunk.length > 0) {
actualResponseText += textSegmentsInChunk.join("");
}
if (findEndOfUnitMarker(chunk)) {
overallMarkerFound = true;
}
if (this.includeThinkingInMessage) {
if (Array.isArray(chunk) && chunk[0] && Array.isArray(chunk[0][0]) && chunk[0][0][2]) {
const potentialThinkingBlock = chunk[0][0][2];
const thinkingSegments = extractTextSegments(potentialThinkingBlock);
const thinkingBlockText = thinkingSegments.join("").trim();
if (thinkingBlockText && !actualResponseText.includes(thinkingBlockText)) {
thinkingAndProcessText += thinkingBlockText + "\n";
}
}
}
}
} else {
if (typeof parsedJson === 'string') {
actualResponseText = parsedJson;
overallMarkerFound = true;
} else {
console.warn(`[${this.name}] Parsed JSON is not an array as expected. Type: ${typeof parsedJson}. Content (first 100): ${JSON.stringify(parsedJson).substring(0,100)}`);
const genericText = extractTextSegments(parsedJson).join("");
if (genericText) {
actualResponseText = genericText;
overallMarkerFound = true;
} else {
actualResponseText = "[Error: Unexpected JSON structure from AI Studio]";
overallMarkerFound = true;
}
}
}
actualResponseText = actualResponseText.replace(/\\n/g, "\n").replace(/\n\s*\n/g, '\n').trim();
thinkingAndProcessText = thinkingAndProcessText.replace(/\\n/g, "\n").replace(/\n\s*\n/g, '\n').trim();
} catch (e) {
console.error(`[${this.name}] Error parsing AI Studio debugger response JSON:`, e, "Original string (first 200 chars):", jsonString.substring(0, 200));
const formattedFallback = this.formatOutput("", jsonString);
return { text: formattedFallback, isFinalResponse: true };
}
const formattedOutput = this.formatOutput(thinkingAndProcessText, actualResponseText);
if (formattedOutput.trim() === "" && overallMarkerFound) {
return { text: "", isFinalResponse: true };
}
return { text: formattedOutput, isFinalResponse: overallMarkerFound };
}
formatOutput(thinkingText, answerText) {
if (this.includeThinkingInMessage && thinkingText && thinkingText.trim() !== "") {
try {
const result = {
thinking: thinkingText.trim(),
answer: (answerText || "").trim()
};
return JSON.stringify(result);
} catch (e) {
console.error(`[${this.name}] Error stringifying thinking/answer object:`, e);
return (answerText || "").trim();
}
}
return (answerText || "").trim();
}
// --- END OF CORRECTED DEBUGGER PARSING LOGIC ---
// --- Other methods (DOM fallback, etc. - largely unchanged but included for completeness) ---
_findResponseElementDOM(container) {
console.log(`[${this.name}] _findResponseElementDOM called on container:`, container);
if (!container) return null;
const elements = container.querySelectorAll(this.responseSelectorForDOMFallback);
if (elements.length > 0) {
const lastElement = elements[elements.length - 1];
console.log(`[${this.name}] Found last response element via DOM:`, lastElement);
// Guard against capturing the user's own message: skip the element if its text equals the last sent message.
// Older responses are not detected by this check.
if (lastElement.textContent && lastElement.textContent.trim() !== this.lastSentMessage) {
return lastElement;
}
}
console.log(`[${this.name}] No suitable response element found via DOM in container.`);
return null;
}
shouldSkipResponseMonitoring() {
// With the debugger capture method, response data arrives from the background script,
// so DOM response monitoring is skipped. With the DOM method, monitoring is required.
// console.log(`[${this.name}] shouldSkipResponseMonitoring called. Capture method: ${this.captureMethod}`);
return this.captureMethod === "debugger";
}
_isResponseStillGeneratingDOM() {
// This is for the DOM fallback method
const thinkingIndicator = document.querySelector(this.thinkingIndicatorSelectorForDOM);
if (thinkingIndicator) {
// console.log(`[${this.name}] DOM Fallback: Thinking indicator found.`);
return true;
}
// console.log(`[${this.name}] DOM Fallback: No thinking indicator found.`);
return false;
}
getStreamingApiPatterns() {
console.log(`[${this.name}] getStreamingApiPatterns called. Capture method: ${this.captureMethod}`);
if (this.captureMethod === "debugger" && this.debuggerUrlPattern) {
console.log(`[${this.name}] Using debugger URL pattern: ${this.debuggerUrlPattern}`);
return [{ urlPattern: this.debuggerUrlPattern, requestStage: "Response" }];
}
console.log(`[${this.name}] No debugger patterns to return (captureMethod is not 'debugger' or no pattern set).`);
return [];
}
_startDOMMonitoring(requestId) {
console.log(`[${this.name}] DOM Fallback: _startDOMMonitoring for requestId: ${requestId}`);
this._stopDOMMonitoring(); // Stop any existing observer
const callback = this.pendingResponseCallbacks.get(requestId);
if (!callback) {
console.error(`[${this.name}] DOM Fallback: No callback for requestId ${requestId} in _startDOMMonitoring.`);
return;
}
let attempts = 0;
const maxAttempts = 15; // Try for ~15 seconds
const interval = 1000;
this.domMonitorTimer = setInterval(() => {
console.log(`[${this.name}] DOM Fallback: Polling attempt ${attempts + 1}/${maxAttempts} for requestId: ${requestId}`);
const responseData = this._captureResponseDOM(); // No element is passed, so it queries this.responseSelector
if (responseData.found && responseData.text.trim() !== "") {
console.log(`[${this.name}] DOM Fallback: Response captured for requestId ${requestId}. Text (first 100): ${responseData.text.substring(0,100)}`);
this._stopDOMMonitoring();
callback(requestId, responseData.text, true); // Assume final for DOM capture
this.pendingResponseCallbacks.delete(requestId);
} else {
attempts++;
if (attempts >= maxAttempts) {
console.warn(`[${this.name}] DOM Fallback: Max attempts reached for requestId ${requestId}. No response captured.`);
this._stopDOMMonitoring();
callback(requestId, "[Error: Timed out waiting for DOM response]", true); // Error, final
this.pendingResponseCallbacks.delete(requestId);
}
}
}, interval);
console.log(`[${this.name}] DOM Fallback: Monitoring started with timer ID ${this.domMonitorTimer}`);
}
_stopDOMMonitoring() {
if (this.domMonitorTimer) {
console.log(`[${this.name}] DOM Fallback: Stopping DOM monitoring timer ID ${this.domMonitorTimer}`);
clearInterval(this.domMonitorTimer);
this.domMonitorTimer = null;
}
}
}
// Register the provider with window.providerUtils so the content script can look it up
if (window.providerUtils) {
const providerInstance = new AIStudioProvider();
window.providerUtils.registerProvider(
providerInstance.name,
providerInstance.supportedDomains,
providerInstance
);
} else {
console.error("AIStudioProvider: providerUtils not found. Registration failed.");
}
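// --- Illustrative sketch (not used by the provider) ---
// A minimal, standalone restatement of the two structural conventions that
// parseDebuggerResponse() above relies on: text travels in [null, "string"] pairs,
// and a unit ends with an array whose last two items are "model", 1.
// The sketch* names and the sample payload are hypothetical; the payload only
// mimics that shape, and real AI Studio responses may be nested differently.
function sketchExtractTextSegments(data, segments = []) {
  if (Array.isArray(data)) {
    if (data.length > 1 && data[0] === null && typeof data[1] === 'string') {
      // Leaf node carrying text: collect it and do not descend further
      segments.push(data[1]);
    } else {
      for (const item of data) {
        sketchExtractTextSegments(item, segments);
      }
    }
  }
  return segments;
}
function sketchHasEndOfUnitMarker(data) {
  if (!Array.isArray(data)) return false;
  if (data.length >= 2 && data[data.length - 1] === 1 && data[data.length - 2] === "model") {
    return true;
  }
  // Otherwise look for the marker in any nested array
  return data.some(sketchHasEndOfUnitMarker);
}
function sketchParseDemo() {
  // Hypothetical payload: one chunk holding two text segments and the end marker
  const samplePayload = [[[[[null, "Hello, "]], [[null, "world."]]], "model", 1]];
  return {
    text: sketchExtractTextSegments(samplePayload).join(""),
    isFinalResponse: sketchHasEndOfUnitMarker(samplePayload)
  };
}
// Usage: call sketchParseDemo() from the console to see how the joined text and
// the final-response flag are derived from a payload of this shape.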

@ -0,0 +1,719 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// AI Chat Relay - ChatGPT Provider
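// --- Illustrative sketch (not used by the provider) ---
// handleDebuggerData() in the class below folds each parsed chunk into a
// per-request accumulator: a "replace" chunk overwrites the accumulated text,
// any other chunk is appended, and a final chunk marks the accumulator as done.
// sketchApplyChunk is a hypothetical helper that restates that rule in isolation.
// Treating null text as an empty string is an assumption of this sketch.
function sketchApplyChunk(accumulator, parseOutput) {
  const text = parseOutput.text ?? "";
  if (parseOutput.operation === "replace") {
    accumulator.text = text;
  } else {
    accumulator.text += text;
  }
  if (parseOutput.isFinalResponse) {
    accumulator.isDefinitelyFinal = true;
  }
  return accumulator;
}
// Usage: start from { text: "", isDefinitelyFinal: false } and apply each parsed
// chunk in arrival order; the accumulator then holds the full text seen so far.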
class ChatGptProvider {
constructor() {
// --- START OF CONFIGURABLE PROPERTIES ---
this.captureMethod = "debugger";
this.debuggerUrlPattern = "*chatgpt.com/backend-api/conversation*";
this.includeThinkingInMessage = true;
// --- END OF CONFIGURABLE PROPERTIES ---
this.name = "ChatGptProvider";
this.supportedDomains = ["chatgpt.com"];
this.inputSelector = '#prompt-textarea';
this.sendButtonSelector = 'button[data-testid="send-button"]'; // Use data-testid
this.responseSelector = '.message-bubble .text-content';
this.thinkingIndicatorSelector = '.loading-spinner';
this.responseSelectorForDOMFallback = '.message-container .response-text';
this.thinkingIndicatorSelectorForDOM = '.thinking-dots, .spinner-animation';
this.lastSentMessage = '';
this.pendingResponseCallbacks = new Map();
this.requestAccumulators = new Map();
this.domMonitorTimer = null;
console.log(`[${this.name}] Provider initialized for domains: ${this.supportedDomains.join(', ')}`);
}
async sendChatMessage(messageContent, requestId) { // Changed parameter name
console.log(`[${this.name}] sendChatMessage called for requestId ${requestId} with content type:`, typeof messageContent, Array.isArray(messageContent) ? `Array length: ${messageContent.length}` : '');
const MAX_RETRIES = 5;
const RETRY_DELAY_MS_BASE = 250;
// --- 1. Find and check the input field ---
const inputField = document.querySelector(this.inputSelector);
if (!inputField) {
console.error(`[${this.name}] Input field (selector: ${this.inputSelector}) not found for requestId ${requestId}.`);
this._reportSendError(requestId, `Input field not found: ${this.inputSelector}`);
return false;
}
if (inputField.disabled || inputField.hasAttribute('disabled')) {
console.warn(`[${this.name}] Input field (selector: ${this.inputSelector}) is disabled for requestId ${requestId}.`);
this._reportSendError(requestId, `Input field is disabled: ${this.inputSelector}`);
return false;
}
// --- 2. Prepare and set content ONCE ---
try {
let textToInput = "";
let blobToPaste = null;
let blobMimeType = "image/png"; // Default
if (typeof messageContent === 'string') {
textToInput = messageContent;
this.lastSentMessage = textToInput;
console.log(`[${this.name}] Handling string content for requestId ${requestId}:`, textToInput.substring(0, 70) + "...");
} else if (messageContent instanceof Blob) {
blobToPaste = messageContent;
blobMimeType = messageContent.type || blobMimeType;
this.lastSentMessage = `Blob data (type: ${blobMimeType}, size: ${blobToPaste.size}) for requestId ${requestId}`;
console.log(`[${this.name}] Handling Blob content for requestId ${requestId}. Size: ${blobToPaste.size}, Type: ${blobMimeType}`);
} else if (Array.isArray(messageContent)) {
console.log(`[${this.name}] Handling array content for requestId ${requestId}.`);
for (const part of messageContent) {
if (part.type === "text" && typeof part.text === 'string') {
textToInput += (textToInput ? "\n" : "") + part.text;
} else if (part.type === "image_url" && part.image_url && typeof part.image_url.url === 'string') {
if (!blobToPaste) { // Prioritize the first image
try {
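const response = await fetch(part.image_url.url);
if (!response.ok) {
// Avoid pasting an error page as if it were image data
throw new Error(`HTTP ${response.status} while fetching image`);
}
blobToPaste = await response.blob();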
blobMimeType = blobToPaste.type || blobMimeType;
console.log(`[${this.name}] Fetched image_url as Blob for requestId ${requestId}. Size: ${blobToPaste.size}, Type: ${blobMimeType}`);
} catch (e) {
console.error(`[${this.name}] Error fetching image_url ${part.image_url.url} for requestId ${requestId}:`, e);
// Optionally report error and return false if image is critical
}
} else {
console.warn(`[${this.name}] Multiple image_urls found for requestId ${requestId}, only the first will be processed.`);
}
}
}
this.lastSentMessage = `Array content (Text: "${textToInput.substring(0,50)}...", Image: ${blobToPaste ? 'Yes' : 'No'}) for requestId ${requestId}`;
} else {
console.error(`[${this.name}] Unhandled message content type: ${typeof messageContent} for requestId ${requestId}. Cannot send.`);
this.lastSentMessage = `Unhandled data type: ${typeof messageContent}`;
this._reportSendError(requestId, `Unhandled message content type: ${typeof messageContent}`);
return false;
}
// Set text input if any
if (textToInput) {
inputField.innerText = textToInput; // Use .innerText for contenteditable div
console.log(`[${this.name}] Set inputField.innerText for requestId ${requestId}.`);
} else {
inputField.innerText = ""; // Clear if only image or no text
console.log(`[${this.name}] Cleared inputField.innerText (no text part) for requestId ${requestId}.`);
}
inputField.dispatchEvent(new Event('input', { bubbles: true, cancelable: true }));
// Paste blob if any
if (blobToPaste) {
const dataTransfer = new DataTransfer();
const file = new File([blobToPaste], "pasted_image." + (blobMimeType.split('/')[1] || 'png'), { type: blobMimeType });
dataTransfer.items.add(file);
const pasteEvent = new ClipboardEvent('paste', {
clipboardData: dataTransfer,
bubbles: true,
cancelable: true
});
inputField.dispatchEvent(pasteEvent);
console.log(`[${this.name}] Dispatched paste event with Blob data for requestId ${requestId}.`);
}
inputField.focus();
await new Promise(resolve => setTimeout(resolve, 750)); // Delay for UI to update after content set
} catch (error) {
console.error(`[${this.name}] Error during content preparation for requestId ${requestId}:`, error);
this._reportSendError(requestId, `Exception during content preparation: ${error.message}`);
return false;
}
// --- 3. Retry loop for finding and clicking the send button ---
for (let attempt = 0; attempt < MAX_RETRIES; attempt++) {
const currentDelay = RETRY_DELAY_MS_BASE + (attempt * 100);
if (attempt > 0) await new Promise(resolve => setTimeout(resolve, currentDelay)); // No delay on first attempt of this loop
try {
const sendButton = document.querySelector(this.sendButtonSelector);
if (!sendButton) {
console.error(`[${this.name}] Send button (selector: ${this.sendButtonSelector}) not found on attempt ${attempt + 1} for requestId ${requestId}.`);
if (attempt === MAX_RETRIES - 1) {
this._reportSendError(requestId, `Send button not found: ${this.sendButtonSelector}`);
return false;
}
continue;
}
const isDisabled = sendButton.disabled ||
sendButton.hasAttribute('disabled') ||
sendButton.getAttribute('aria-disabled') === 'true' ||
sendButton.classList.contains('disabled');
console.log(`[${this.name}] Attempt ${attempt + 1} for requestId ${requestId} (Send Button Loop): Selector: '${this.sendButtonSelector}', Found: ${!!sendButton}, Disabled: ${isDisabled}, aria-disabled: ${sendButton.getAttribute('aria-disabled')}`);
if (!isDisabled) {
console.log(`[${this.name}] Clicking send button (selector: ${this.sendButtonSelector}) on attempt ${attempt + 1} for requestId ${requestId}.`);
sendButton.click();
console.log(`[${this.name}] Send button clicked for requestId ${requestId}. Returning true.`);
return true;
} else {
console.warn(`[${this.name}] Send button (selector: ${this.sendButtonSelector}) is disabled on attempt ${attempt + 1} for requestId ${requestId}.`);
// If button is disabled, try to trigger UI updates that might enable it
// These events are on inputField as they might influence the button's state
inputField.dispatchEvent(new Event('input', { bubbles: true, cancelable: true }));
inputField.dispatchEvent(new Event('change', { bubbles: true, cancelable: true }));
inputField.focus(); // Re-focus input field
await new Promise(resolve => setTimeout(resolve, 50)); // Short delay
if (attempt === MAX_RETRIES - 1) {
console.error(`[${this.name}] Send button still disabled on final attempt for requestId ${requestId}. Selectors: Input='${this.inputSelector}', Button='${this.sendButtonSelector}'.`);
this._reportSendError(requestId, `Send button remained disabled after ${MAX_RETRIES} attempts: ${this.sendButtonSelector}`);
return false;
}
}
} catch (error) {
console.error(`[${this.name}] Error during send button click attempt ${attempt + 1} for requestId ${requestId}:`, error);
if (attempt === MAX_RETRIES - 1) {
this._reportSendError(requestId, `Exception during send button click: ${error.message}`);
return false;
}
// Continue to next attempt if error is not on the last attempt
}
}
this._reportSendError(requestId, `Exhausted all retries for sendChatMessage (send button loop) for requestId ${requestId}.`);
return false;
}
_reportSendError(requestId, errorMessage) {
console.error(`[${this.name}] Reporting send error for requestId ${requestId}: ${errorMessage}`);
const callback = this.pendingResponseCallbacks.get(requestId);
if (callback) {
callback(requestId, `[PROVIDER_SEND_ERROR: ${errorMessage}]`, true);
this.pendingResponseCallbacks.delete(requestId);
this.requestAccumulators.delete(requestId);
} else {
console.warn(`[${this.name}] No callback found to report send error for requestId ${requestId}.`);
}
}
initiateResponseCapture(requestId, responseCallback) {
console.log(`[${this.name}] initiateResponseCapture called for requestId: ${requestId}. Capture method: ${this.captureMethod}`);
this.pendingResponseCallbacks.set(requestId, responseCallback);
if (this.captureMethod === "debugger") {
console.log(`[${this.name}] Debugger capture selected. Callback stored for requestId: ${requestId}. Ensure background script is set up for '${this.debuggerUrlPattern}'.`);
} else if (this.captureMethod === "dom") {
console.log(`[${this.name}] DOM capture selected. Starting DOM monitoring for requestId: ${requestId}`);
this._stopDOMMonitoring();
this._startDOMMonitoring(requestId);
} else {
console.error(`[${this.name}] Unknown capture method: ${this.captureMethod}`);
responseCallback(requestId, `[Error: Unknown capture method '${this.captureMethod}' in provider]`, true);
this.pendingResponseCallbacks.delete(requestId);
}
}
handleDebuggerData(requestId, rawData, isFinalFromBackground) {
console.log(`[${this.name}] handleDebuggerData ENTER - requestId: ${requestId}, isFinalFromBackground: ${isFinalFromBackground}, rawData: "${rawData ? rawData.substring(0,150) + (rawData.length > 150 ? "..." : "") : "null/empty"}"`);
const callback = this.pendingResponseCallbacks.get(requestId);
if (!callback) {
console.warn(`[${this.name}] handleDebuggerData - No callback for requestId: ${requestId}. RawData: ${rawData ? rawData.substring(0,50) : "null"}`);
return;
}
let accumulator = this.requestAccumulators.get(requestId);
if (!accumulator) {
accumulator = { text: "", isDefinitelyFinal: false, currentProcessingStage: undefined }; // Initialize stage
this.requestAccumulators.set(requestId, accumulator);
console.log(`[${this.name}] handleDebuggerData - Initialized new accumulator for ${requestId}: ${JSON.stringify(accumulator)}`);
}
console.log(`[${this.name}] handleDebuggerData - Accumulator state for ${requestId} BEFORE processing: ${JSON.stringify(accumulator)}`);
if (accumulator.isDefinitelyFinal) {
console.log(`[${this.name}] handleDebuggerData - Accumulator for ${requestId} is already final. Skipping.`);
return;
}
if (rawData && rawData.trim() !== "") {
let isLikelyNonChatJson = false;
if (!rawData.includes("data:") && rawData.trim().startsWith("{") && rawData.trim().endsWith("}")) {
try {
const jsonData = JSON.parse(rawData);
if (typeof jsonData.safe === 'boolean' && typeof jsonData.blocked === 'boolean') {
isLikelyNonChatJson = true;
console.log(`[${this.name}] handleDebuggerData - Detected likely non-chat JSON for ${requestId}, skipping parse.`);
}
} catch (e) { /* Not simple JSON */ }
}
if (isLikelyNonChatJson) {
// Ignore
} else {
const parseOutput = this.parseDebuggerResponse(rawData, accumulator.currentProcessingStage);
accumulator.currentProcessingStage = parseOutput.newProcessingStage; // Update stage
console.log(`[${this.name}] handleDebuggerData - requestId: ${requestId}, parseOutput: ${JSON.stringify(parseOutput)}`);
if (parseOutput.text !== null || parseOutput.operation === "replace") { // Check for null explicitly if empty string is valid
if (parseOutput.operation === "replace") {
console.log(`[${this.name}] handleDebuggerData - Operation: replace. Old text for ${requestId}: "${accumulator.text.substring(0,50)}...". New text: "${parseOutput.text ? parseOutput.text.substring(0,50) : "null"}..."`);
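accumulator.text = parseOutput.text ?? ""; // A "replace" chunk may carry null text; keep the accumulator a string so later substring() calls do not throw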
} else { // append
console.log(`[${this.name}] handleDebuggerData - Operation: append. Current text for ${requestId}: "${accumulator.text.substring(0,50)}...". Appending: "${parseOutput.text ? parseOutput.text.substring(0,50) : "null"}..."`);
accumulator.text += parseOutput.text;
}
}
console.log(`[${this.name}] handleDebuggerData - Accumulator text for ${requestId} AFTER update: "${accumulator.text.substring(0,100)}..."`);
if (parseOutput.isFinalResponse) {
accumulator.isDefinitelyFinal = true;
console.log(`[${this.name}] handleDebuggerData - ${requestId} marked as definitelyFinal by parseOutput.`);
}
// Invoke callback if there's new text, or if it's final, or if it was a replace operation (even with empty string)
if (parseOutput.text !== null || accumulator.isDefinitelyFinal || parseOutput.operation === "replace") {
console.log(`[${this.name}] handleDebuggerData - INVOKING CALLBACK for ${requestId}. Text: "${accumulator.text.substring(0,100)}...", isFinal: ${accumulator.isDefinitelyFinal}, Stage: ${accumulator.currentProcessingStage}`);
callback(requestId, accumulator.text, accumulator.isDefinitelyFinal);
}
}
} else {
if (isFinalFromBackground && !accumulator.isDefinitelyFinal) {
accumulator.isDefinitelyFinal = true;
console.log(`[${this.name}] handleDebuggerData - RawData empty, but isFinalFromBackground=true. INVOKING CALLBACK for ${requestId}. Text: "${accumulator.text.substring(0,100)}...", isFinal: true (forced)`);
callback(requestId, accumulator.text, accumulator.isDefinitelyFinal);
}
}
if (accumulator.isDefinitelyFinal) {
console.log(`[${this.name}] handleDebuggerData - CLEANING UP for ${requestId} as accumulator.isDefinitelyFinal is true.`);
this.pendingResponseCallbacks.delete(requestId);
this.requestAccumulators.delete(requestId);
}
}
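/*
Illustrative sketch, not part of the provider: the replace/append bookkeeping in
handleDebuggerData above can be isolated into a small pure function. The name
applyChunk and the plain-object accumulator are assumptions made for this example;
only the fields text, operation, isFinalResponse and isDefinitelyFinal come from
the code above.

```javascript
// Hypothetical helper: merges one parsed chunk into a running accumulator.
function applyChunk(accumulator, parseOutput) {
  const hasText = parseOutput.text !== null && parseOutput.text !== undefined;
  if (parseOutput.operation === "replace") {
    // A replace discards earlier text; a missing text becomes "".
    accumulator.text = hasText ? parseOutput.text : "";
  } else if (hasText) {
    accumulator.text += parseOutput.text;
  }
  if (parseOutput.isFinalResponse) {
    accumulator.isDefinitelyFinal = true;
  }
  // Mirrors the callback condition: new text, a replace, or finality.
  return hasText || parseOutput.operation === "replace" || accumulator.isDefinitelyFinal;
}

const acc = { text: "", isDefinitelyFinal: false };
applyChunk(acc, { text: "thinking...", operation: "append", isFinalResponse: false });
applyChunk(acc, { text: "Hello", operation: "replace", isFinalResponse: false });
applyChunk(acc, { text: ", world", operation: "append", isFinalResponse: true });
console.log(acc.text, acc.isDefinitelyFinal); // Hello, world true
```

The return value says whether a caller would want to notify its callback, which
keeps the notification rule testable without a debugger session.
*/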
// Parses the raw response from the debugger.
// Returns an object: { text: "content_from_this_chunk", isFinalResponse: boolean, operation: "replace" | "append", newProcessingStage: string }
parseDebuggerResponse(rawDataString, currentProcessingStage) {
let textForThisChunk = null; // Use null to distinguish from empty string if needed
let isFinalResponse = false;
let chunkOverallOperation = "append"; // Default to append
let newProcessingStage = currentProcessingStage;
console.log(`[${this.name}] parseDebuggerResponse ENTER. currentProcessingStage: ${currentProcessingStage}, includeThinking: ${this.includeThinkingInMessage}, rawDataString: "${rawDataString ? rawDataString.substring(0,100) + "..." : "null"}"`);
if (rawDataString === null || typeof rawDataString === 'undefined' || rawDataString.trim() === "") {
return { text: null, isFinalResponse: false, operation: "append", newProcessingStage };
}
// Skip non-SSE JSON like {"safe": true, "blocked": false}
if (!rawDataString.includes("data:") && rawDataString.trim().startsWith("{") && rawDataString.trim().endsWith("}")) {
try {
const jsonData = JSON.parse(rawDataString);
if (typeof jsonData.safe === 'boolean' && typeof jsonData.blocked === 'boolean') {
console.log(`[${this.name}] parseDebuggerResponse - Skipping non-chat JSON: ${rawDataString.substring(0,50)}`);
return { text: null, isFinalResponse: false, operation: "append", newProcessingStage };
}
} catch (e) { /* Fall through, might be a malformed SSE line or other JSON */ }
}
const lines = rawDataString.split('\n');
for (const line of lines) {
if (line.startsWith('data: ')) {
const dataJson = line.substring(6).trim();
if (dataJson === '[DONE]') {
isFinalResponse = true;
console.log(`[${this.name}] parseDebuggerResponse - Encountered [DONE]`);
break;
}
if (dataJson === "") continue;
try {
const data = JSON.parse(dataJson);
console.log(`[${this.name}] parseDebuggerResponse - Processing SSE data: ${JSON.stringify(data).substring(0,150)}...`);
let currentLineText = "";
let currentLineIsReplaceOperation = false; // Indicates if this specific line's content should replace prior content *within this chunk*
let messageNode = data.message;
if (data.p === "" && data.o === "add" && data.v && data.v.message) {
messageNode = data.v.message;
}
let contentType = null;
if (messageNode && messageNode.content && messageNode.content.content_type) {
contentType = messageNode.content.content_type;
}
console.log(`[${this.name}] parseDebuggerResponse - Identified contentType: ${contentType}, currentProcessingStage: ${newProcessingStage}`);
// --- Stage and Text Extraction Logic ---
if (this.includeThinkingInMessage) {
// --- INCLUDE THINKING: Extract text from thoughts and content ---
if (contentType === "thoughts") {
if (newProcessingStage !== "processing_thoughts") {
chunkOverallOperation = "replace"; // Replace previous stage's content
textForThisChunk = ""; // Start fresh for this chunk
}
newProcessingStage = "processing_thoughts";
if (messageNode.content.thoughts && Array.isArray(messageNode.content.thoughts)) {
messageNode.content.thoughts.forEach(thought => {
if (thought.summary) currentLineText += thought.summary + "\n";
if (thought.content) currentLineText += thought.content + "\n";
});
}
console.log(`[${this.name}] parseDebuggerResponse (Thinking TRUE) - THOUGHTS: "${currentLineText.substring(0,50)}..."`);
} else if (contentType === "reasoning_recap") {
if (newProcessingStage === "processing_thoughts") {
newProcessingStage = "awaiting_content"; // Thoughts ended, expecting content
}
// No text from recap itself
console.log(`[${this.name}] parseDebuggerResponse (Thinking TRUE) - REASONING_RECAP. New stage: ${newProcessingStage}`);
} else if (contentType === "text") {
if (newProcessingStage !== "processing_content") {
chunkOverallOperation = "replace"; // Replace previous stage's content
textForThisChunk = ""; // Start fresh for this chunk
}
newProcessingStage = "processing_content";
if (messageNode.content.parts && messageNode.content.parts.length > 0 && typeof messageNode.content.parts[0] === 'string') {
currentLineText = messageNode.content.parts[0];
currentLineIsReplaceOperation = true; // A full text part replaces
}
console.log(`[${this.name}] parseDebuggerResponse (Thinking TRUE) - TEXT: "${currentLineText.substring(0,50)}..."`);
}
} else {
// --- INCLUDE THINKING FALSE: Skip thoughts, only process text ---
if (contentType === "thoughts") {
newProcessingStage = "processing_thoughts";
textForThisChunk = ""; // Ensure no text from thoughts is carried
chunkOverallOperation = "replace"; // Next "text" content should replace this empty string
console.log(`[${this.name}] parseDebuggerResponse (Thinking FALSE) - SKIPPING THOUGHTS. Stage: ${newProcessingStage}. Chunk op: ${chunkOverallOperation}`);
// Check for finality even in thoughts
if (messageNode && messageNode.status === "finished_successfully" && messageNode.end_turn === true) isFinalResponse = true;
continue;
} else if (contentType === "reasoning_recap") {
if (newProcessingStage === "processing_thoughts") {
newProcessingStage = "awaiting_content";
}
console.log(`[${this.name}] parseDebuggerResponse (Thinking FALSE) - SKIPPING REASONING_RECAP. Stage: ${newProcessingStage}`);
if (messageNode && messageNode.status === "finished_successfully" && messageNode.end_turn === true) isFinalResponse = true;
continue;
} else if (contentType === "text") {
if (newProcessingStage === "processing_thoughts" || newProcessingStage === "awaiting_content" || newProcessingStage === undefined) {
chunkOverallOperation = "replace"; // This is the first actual content, replace anything prior (e.g. empty from thoughts)
textForThisChunk = ""; // Ensure we start fresh for this chunk if replacing
}
newProcessingStage = "processing_content";
if (messageNode.content.parts && messageNode.content.parts.length > 0 && typeof messageNode.content.parts[0] === 'string') {
currentLineText = messageNode.content.parts[0];
currentLineIsReplaceOperation = true; // A full text part
}
console.log(`[${this.name}] parseDebuggerResponse (Thinking FALSE) - TEXT: "${currentLineText.substring(0,50)}...". Stage: ${newProcessingStage}. Chunk op: ${chunkOverallOperation}`);
}
}
// JSON Patch operations (apply to both includeThinking true/false if it's for content parts)
if (data.p === "" && data.o === "patch" && Array.isArray(data.v)) {
for (const patch of data.v) {
if (patch.p === "/message/content/parts/0" && typeof patch.v === 'string') {
// If we are not including thinking, and we haven't hit a "text" content type yet, this patch might be the first "text"
if (!this.includeThinkingInMessage && newProcessingStage !== "processing_content") {
if (newProcessingStage === "processing_thoughts" || newProcessingStage === "awaiting_content" || newProcessingStage === undefined) {
chunkOverallOperation = "replace";
textForThisChunk = ""; // Start fresh
}
newProcessingStage = "processing_content"; // Patches to content/parts/0 mean we are in content
console.log(`[${this.name}] parseDebuggerResponse - Patch to content/parts/0, transitioning to 'processing_content'. Chunk op: ${chunkOverallOperation}`);
}
// If including thinking, and current stage is not content, this patch might be the first content
else if (this.includeThinkingInMessage && newProcessingStage !== "processing_content") {
chunkOverallOperation = "replace"; // Replace thoughts
textForThisChunk = ""; // Start fresh
newProcessingStage = "processing_content";
console.log(`[${this.name}] parseDebuggerResponse (Thinking TRUE) - Patch to content/parts/0, transitioning to 'processing_content'. Chunk op: ${chunkOverallOperation}`);
}
if (patch.o === "append") {
currentLineText += patch.v;
currentLineIsReplaceOperation = false; // Append to current line's text
} else if (patch.o === "replace") {
currentLineText = patch.v;
currentLineIsReplaceOperation = true; // Replace current line's text
}
console.log(`[${this.name}] parseDebuggerResponse - Patch applied. currentLineText: "${currentLineText.substring(0,50)}...", currentLineIsReplaceOp: ${currentLineIsReplaceOperation}`);
}
// Finality from patch metadata
if ((patch.p === "/message/metadata/finish_details/type" && patch.v === "stop") ||
(patch.p === "/message/metadata/finish_reason" && patch.v === "stop") ||
(patch.p === "/message/status" && patch.v === "finished_successfully")) {
isFinalResponse = true;
}
}
}
// Direct operations on content parts (e.g., from o3 model logs)
else if (data.p === "/message/content/parts/0" && typeof data.v === 'string' && (this.includeThinkingInMessage || newProcessingStage === "processing_content" || newProcessingStage === undefined)) {
if (!this.includeThinkingInMessage && newProcessingStage !== "processing_content") {
if (newProcessingStage === "processing_thoughts" || newProcessingStage === "awaiting_content" || newProcessingStage === undefined) {
chunkOverallOperation = "replace";
textForThisChunk = "";
}
newProcessingStage = "processing_content";
} else if (this.includeThinkingInMessage && newProcessingStage !== "processing_content") {
chunkOverallOperation = "replace";
textForThisChunk = "";
newProcessingStage = "processing_content";
}
if (data.o === "replace") {
currentLineText = data.v;
currentLineIsReplaceOperation = true;
} else if (data.o === "append") {
currentLineText = data.v;
currentLineIsReplaceOperation = false;
}
console.log(`[${this.name}] parseDebuggerResponse - Direct op on content/parts/0. currentLineText: "${currentLineText.substring(0,50)}...", currentLineIsReplaceOp: ${currentLineIsReplaceOperation}`);
}
// Simple delta format (e.g., data: {"v": " some text"}) - common in 4o
else if (typeof data.v === 'string' && data.p === undefined && data.o === undefined && !contentType) {
// This is likely a text delta if no specific content_type was identified yet.
// Treat as content if we are not explicitly in 'thoughts' when includeThinkingInMessage is false.
if (!this.includeThinkingInMessage && newProcessingStage !== "processing_content") {
if (newProcessingStage === "processing_thoughts" || newProcessingStage === "awaiting_content" || newProcessingStage === undefined) {
chunkOverallOperation = "replace";
textForThisChunk = "";
}
newProcessingStage = "processing_content";
} else if (this.includeThinkingInMessage && newProcessingStage !== "processing_content" && newProcessingStage !== "processing_thoughts") {
// If including thinking, but not in thoughts or content, this is likely start of content
chunkOverallOperation = "replace";
textForThisChunk = "";
newProcessingStage = "processing_content";
}
currentLineText = data.v;
currentLineIsReplaceOperation = false; // Assume append for simple deltas unless it's the first part of content
console.log(`[${this.name}] parseDebuggerResponse - Simple delta {"v": ...}. currentLineText: "${currentLineText.substring(0,50)}..."`);
}
// Fallback for OpenAI standard delta (choices...delta.content)
else if (data.choices && data.choices[0] && data.choices[0].delta && typeof data.choices[0].delta.content === 'string') {
if (!this.includeThinkingInMessage && newProcessingStage !== "processing_content") {
if (newProcessingStage === "processing_thoughts" || newProcessingStage === "awaiting_content" || newProcessingStage === undefined) {
chunkOverallOperation = "replace";
textForThisChunk = "";
}
newProcessingStage = "processing_content";
} else if (this.includeThinkingInMessage && newProcessingStage !== "processing_content" && newProcessingStage !== "processing_thoughts") {
chunkOverallOperation = "replace";
textForThisChunk = "";
newProcessingStage = "processing_content";
}
currentLineText = data.choices[0].delta.content;
currentLineIsReplaceOperation = false;
console.log(`[${this.name}] parseDebuggerResponse - OpenAI delta. currentLineText: "${currentLineText.substring(0,50)}..."`);
}
// Accumulate text for this chunk based on operations
if (currentLineText) {
if (textForThisChunk === null) textForThisChunk = ""; // Initialize if null
if (currentLineIsReplaceOperation) { // If this line's content is a replacement for the chunk
textForThisChunk = currentLineText;
// If this is the first text part of the chunk, and we decided the chunk should replace, it's already set.
// If not, this specific line replaces previous lines *within this chunk*.
} else {
textForThisChunk += currentLineText;
}
}
console.log(`[${this.name}] parseDebuggerResponse - After line processing. textForThisChunk: "${textForThisChunk ? textForThisChunk.substring(0,70) : "null"}...", chunkOverallOperation: ${chunkOverallOperation}`);
// General finality checks
if (messageNode) {
if (messageNode.metadata && messageNode.metadata.finish_details && messageNode.metadata.finish_details.type === "stop") isFinalResponse = true;
if (messageNode.status === "finished_successfully" && messageNode.end_turn === true) isFinalResponse = true;
}
if (data.choices && data.choices[0] && data.choices[0].finish_reason === 'stop') isFinalResponse = true;
} catch (e) { console.warn(`[${this.name}] parseDebuggerResponse - Error parsing dataJson from line: '${line}'. dataJson: '${dataJson}'. Error:`, e); }
} else if (line.trim() === "" || line.startsWith("event:") || line.startsWith("id:")) {
continue;
} else if (line.trim()) { console.warn(`[${this.name}] parseDebuggerResponse - Unexpected non-data SSE line: ${line}`); }
}
console.log(`[${this.name}] parseDebuggerResponse FINISHING. Returning: text: "${textForThisChunk ? textForThisChunk.substring(0,100) + "..." : "null"}", isFinal: ${isFinalResponse}, operation: "${chunkOverallOperation}", newStage: ${newProcessingStage}`);
return { text: textForThisChunk, isFinalResponse: isFinalResponse, operation: chunkOverallOperation, newProcessingStage };
}
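/*
Illustrative sketch, not part of the provider: a much reduced version of the line
loop in parseDebuggerResponse above. It assumes only two payload shapes, the
[DONE] sentinel and the simple {"v": "..."} delta, and ignores stages, patches
and thinking content entirely. The name parseSseChunk is hypothetical.

```javascript
// Hypothetical, simplified parser: handles only `data:` lines carrying
// either the [DONE] sentinel or a simple {"v": "..."} text delta.
function parseSseChunk(raw) {
  let text = "";
  let isFinal = false;
  for (const line of raw.split("\n")) {
    if (!line.startsWith("data:")) continue; // skip event:, id: and blank lines
    const payload = line.slice(5).trim();
    if (payload === "") continue;
    if (payload === "[DONE]") {
      isFinal = true;
      break;
    }
    try {
      const data = JSON.parse(payload);
      if (typeof data.v === "string") text += data.v;
    } catch (e) {
      // Malformed JSON: ignore this line and keep going.
    }
  }
  return { text, isFinal };
}

const sample = [
  "event: delta",
  'data: {"v": "Hel"}',
  'data: {"v": "lo"}',
  "data: not-json",
  "data: [DONE]",
].join("\n");
console.log(parseSseChunk(sample)); // { text: 'Hello', isFinal: true }
```

Unlike the method above, this sketch accepts `data:` with or without a trailing
space, which is one way to tolerate both spellings of the field prefix.
*/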
formatOutput(thinkingText, answerText) {
if (this.includeThinkingInMessage && thinkingText && thinkingText.trim() !== "") {
try {
const result = {
thinking: thinkingText.trim(),
answer: (answerText || "").trim()
};
return JSON.stringify(result);
} catch (e) {
console.error(`[${this.name}] Error stringifying thinking/answer object:`, e);
return (answerText || "").trim();
}
}
return (answerText || "").trim();
}
_captureResponseDOM(element = null) {
if (!element && this.captureMethod === "dom") {
const elements = document.querySelectorAll(this.responseSelector);
if (elements.length > 0) {
element = elements[elements.length - 1];
}
}
if (!element) {
return { text: null, isStillGenerating: false };
}
let responseText = element.innerText || element.textContent || "";
if (this.lastSentMessage && responseText.trim().startsWith(this.lastSentMessage.trim())) {
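// Compare trimmed against trimmed so the offset matches the startsWith() check above
const potentialActualResponse = responseText.trim().substring(this.lastSentMessage.trim().length).trim();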
if (potentialActualResponse === "") {
return { text: null, isStillGenerating: this._isResponseStillGeneratingDOM() };
}
}
const isStillGenerating = this._isResponseStillGeneratingDOM();
if (responseText && responseText.trim() !== "" && responseText.trim() !== this.lastSentMessage.trim()) {
return {
text: this.formatOutput("", responseText),
isStillGenerating: isStillGenerating
};
}
return { text: null, isStillGenerating: isStillGenerating };
}
_isResponseStillGeneratingDOM() {
if (this.thinkingIndicatorSelector && document.querySelector(this.thinkingIndicatorSelector)) {
return true;
}
if (this.thinkingIndicatorSelectorForDOM && document.querySelector(this.thinkingIndicatorSelectorForDOM)) {
return true;
}
return false;
}
_startDOMMonitoring(requestId) {
console.log(`[${this.name}] Starting DOM monitoring for requestId: ${requestId}. Interval: 500ms.`);
let lastCapturedText = "";
let lastCheckTime = Date.now();
let noChangeStreak = 0;
const monitor = () => {
const callback = this.pendingResponseCallbacks.get(requestId);
if (!callback) {
console.log(`[${this.name}] DOM monitor: Callback for ${requestId} no longer exists. Stopping.`);
this._stopDOMMonitoring();
return;
}
const captureResult = this._captureResponseDOM();
const currentText = captureResult.text;
const isStillGenerating = captureResult.isStillGenerating;
let isFinalDOMResponse = false;
if (currentText && currentText !== lastCapturedText) {
console.log(`[${this.name}] DOM monitor (ReqID: ${requestId}): New content detected. Length: ${currentText.length}. Last length: ${lastCapturedText.length}. Still generating: ${isStillGenerating}`);
lastCapturedText = currentText;
noChangeStreak = 0;
callback(requestId, currentText, false);
} else if (currentText && currentText === lastCapturedText) {
noChangeStreak++;
} else if (!currentText) {
noChangeStreak++;
}
const STABILITY_CHECKS = 4;
if (!isStillGenerating && noChangeStreak >= STABILITY_CHECKS && lastCapturedText.trim() !== "") {
console.log(`[${this.name}] DOM monitor (ReqID: ${requestId}): Response appears stable and complete. No generating indicator, and ${noChangeStreak} unchanged checks.`);
isFinalDOMResponse = true;
}
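// NOTE: lastCheckTime is refreshed after every non-final poll (see the else branch below), so the
// elapsed time measured here is only the gap since the previous poll (about 500ms), not the time
// since the text last changed. In practice the stability check above is what finalizes the response.
const MAX_WAIT_AFTER_NO_GENERATING = 5000;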
if (!isStillGenerating && lastCapturedText.trim() !== "" && (Date.now() - lastCheckTime > MAX_WAIT_AFTER_NO_GENERATING) && noChangeStreak > 0) {
console.log(`[${this.name}] DOM monitor (ReqID: ${requestId}): Max wait time reached after no 'generating' signal. Assuming final.`);
isFinalDOMResponse = true;
}
if (isFinalDOMResponse) {
console.log(`[${this.name}] DOM monitor (ReqID: ${requestId}): Sending final response. Text length: ${lastCapturedText.length}`);
callback(requestId, lastCapturedText, true);
this.pendingResponseCallbacks.delete(requestId);
this._stopDOMMonitoring();
} else {
lastCheckTime = Date.now();
this.domMonitorTimer = setTimeout(monitor, 500);
}
};
this.domMonitorTimer = setTimeout(monitor, 100);
}
_stopDOMMonitoring() {
if (this.domMonitorTimer) {
clearTimeout(this.domMonitorTimer);
this.domMonitorTimer = null;
console.log(`[${this.name}] DOM monitoring stopped.`);
}
}
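/*
Illustrative sketch, not part of the provider: the "stable for N polls" rule used
by _startDOMMonitoring above, separated from timers and the DOM so it can be
exercised directly. The name createStabilityTracker and its parameter are
assumptions for this example.

```javascript
// Hypothetical helper: reports "final" once the text has been non-empty and
// unchanged for `requiredStableChecks` consecutive polls with no
// "generating" indicator visible.
function createStabilityTracker(requiredStableChecks = 4) {
  let lastText = "";
  let unchangedStreak = 0;
  return function observe(currentText, isStillGenerating) {
    if (currentText && currentText !== lastText) {
      lastText = currentText;
      unchangedStreak = 0;
    } else {
      unchangedStreak++;
    }
    const isFinal =
      !isStillGenerating &&
      unchangedStreak >= requiredStableChecks &&
      lastText.trim() !== "";
    return { text: lastText, isFinal };
  };
}

const observe = createStabilityTracker(2);
const results = [
  observe("Hi", true),
  observe("Hi there", true),
  observe("Hi there", false),
  observe("Hi there", false),
].map((r) => r.isFinal);
console.log(results); // [ false, false, false, true ]
```

A caller would invoke observe() from its polling timer and stop polling once
isFinal is true.
*/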
shouldSkipResponseMonitoring(inputText) {
return false;
}
getStreamingApiPatterns() {
if (this.captureMethod === "debugger" && this.debuggerUrlPattern) {
return [{ urlPattern: this.debuggerUrlPattern, requestStage: "Response" }];
}
return [];
}
stopStreaming(requestId) {
console.log(`[${this.name}] stopStreaming called for requestId: ${requestId}`);
const callback = this.pendingResponseCallbacks.get(requestId);
const accumulator = this.requestAccumulators.get(requestId);
let lastKnownText = "";
if (accumulator && typeof accumulator.text === 'string') {
lastKnownText = accumulator.text;
}
if (callback) {
// Send one final message indicating it was stopped, using the last known accumulated text.
console.log(`[${this.name}] stopStreaming - Invoking callback for ${requestId} with final=true and STREAM_STOPPED_BY_USER. Last text: "${lastKnownText.substring(0,50)}..."`);
callback(requestId, `${lastKnownText}[STREAM_STOPPED_BY_USER]`, true);
} else {
console.warn(`[${this.name}] stopStreaming - No pending callback found for requestId: ${requestId} when attempting to stop.`);
}
// Clean up
if (this.pendingResponseCallbacks.has(requestId)) {
this.pendingResponseCallbacks.delete(requestId);
console.log(`[${this.name}] stopStreaming - Deleted pendingResponseCallback for ${requestId}.`);
}
if (this.requestAccumulators.has(requestId)) {
this.requestAccumulators.delete(requestId);
console.log(`[${this.name}] stopStreaming - Deleted requestAccumulator for ${requestId}.`);
}
// If DOM monitoring was active for this request (though less likely if debugger is primary)
if (this.domMonitorTimer && this.captureMethod === "dom") { // Check if this request was the one being monitored
// This is a bit tricky as domMonitorTimer isn't directly tied to a requestId in its current form.
// For now, we'll assume a general stop might also stop DOM monitoring if it was the active one.
// A more robust solution would tie DOM monitor to a specific requestId.
// For debugger method, this part is less relevant.
console.log(`[${this.name}] stopStreaming - Stopping DOM monitoring if it was active (relevant for DOM capture method).`);
this._stopDOMMonitoring();
}
console.log(`[${this.name}] stopStreaming - Cleanup complete for requestId: ${requestId}.`);
}
}
if (window.providerUtils && window.providerUtils.registerProvider) {
const providerInstance = new ChatGptProvider();
window.providerUtils.registerProvider(
providerInstance.name,
providerInstance.supportedDomains,
providerInstance
);
console.log(`[${providerInstance.name}] Provider registered with providerUtils.`);
} else {
console.error("[ChatGptProvider] providerUtils not found. Registration failed. Ensure provider-utils.js is loaded before chatgpt.js");
}

@ -0,0 +1,574 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// AI Chat Relay - Claude Provider
class ClaudeProvider {
constructor() {
// --- START OF CONFIGURABLE PROPERTIES ---
// Method for response capture: "debugger" or "dom"
this.captureMethod = "debugger";
// URL pattern for debugger to intercept if captureMethod is "debugger". Ensure this is specific.
this.debuggerUrlPattern = "*/completion*"; // VERIFY THIS PATTERN
// Whether to include "thinking" process in the message or just the final answer.
// If true, parseDebuggerResponse returns a JSON string: { "thinking": "...", "answer": "..." }
// If false, parseDebuggerResponse returns a string: "answer"
this.includeThinkingInMessage = false;
// Option to enable Claude function calling on load
// ENABLE_CLAUDE_FUNCTION_CALLING: true or false
this.ENABLE_CLAUDE_FUNCTION_CALLING = true;
// --- END OF CONFIGURABLE PROPERTIES ---
this.name = "ClaudeProvider"; // Updated name
this.supportedDomains = ["claude.ai"];
// Selectors for the Claude interface
this.inputSelector = 'div.ProseMirror[contenteditable="true"]';
// The send button selector
this.sendButtonSelector = 'button[aria-label="Send message"]';
// Updated response selectors based on the actual elements
this.responseSelector = '.response-container, .response-text, .model-response, .model-response-container, ms-chat-turn, ms-prompt-chunk, ms-text-chunk, .very-large-text-container, .cmark-node';
// Thinking indicator selector
this.thinkingIndicatorSelector = '.thinking-indicator, .loading-indicator, .typing-indicator, .response-loading, loading-indicator';
// Fallback selectors
this.responseSelectorForDOMFallback = '.response-container, .model-response-text'; // Placeholder, adjust as needed
this.thinkingIndicatorSelectorForDOM = '.thinking-indicator, .spinner'; // Placeholder, adjust as needed
// Last sent message to avoid capturing it as a response
this.lastSentMessage = '';
// Initialize pendingResponseCallbacks
this.pendingResponseCallbacks = new Map();
this.requestBuffers = new Map(); // To accumulate text for each request
// Call the method to ensure function calling is enabled on initial load
// this.ensureFunctionCallingEnabled(); // Commented out as per user request
// Listen for SPA navigation events to re-trigger the check
// if (window.navigation) {
// window.navigation.addEventListener('navigate', (event) => {
// // We are interested in same-document navigations, common in SPAs
// if (!event.canIntercept || event.hashChange || event.downloadRequest !== null) {
// return;
// }
// // Check if the navigation is within the same origin and path structure of AI Studio
// const currentUrl = new URL(window.location.href);
// const destinationUrl = new URL(event.destination.url);
// if (currentUrl.origin === destinationUrl.origin && destinationUrl.pathname.startsWith("/prompts/")) {
// console.log(`[${this.name}] Detected SPA navigation to: ${event.destination.url}. Re-checking function calling toggle.`);
// // Use a timeout to allow the new view's DOM to settle
// setTimeout(() => {
// // this.ensureFunctionCallingEnabled(); // Commented out
// }, 1000); // Delay to allow DOM update
// }
// });
// } else {
// console.warn(`[${this.name}] window.navigation API not available. Function calling toggle may not re-enable on SPA navigations.`);
// }
} // This curly brace correctly closes the constructor.
/* // Commenting out the entire method as per user request
ensureFunctionCallingEnabled() {
if (!this.ENABLE_CLAUDE_FUNCTION_CALLING) {
console.log(`[${this.name}] Function calling is disabled by configuration. Skipping.`);
return;
}
const checkInterval = 500; // ms
const maxDuration = 7000; // ms
let elapsedTime = 0;
const providerName = this.name;
// Clear any existing timer for this specific functionality to avoid multiple polling loops
if (this.functionCallingPollTimer) {
clearTimeout(this.functionCallingPollTimer);
this.functionCallingPollTimer = null;
console.log(`[${providerName}] Cleared previous function calling poll timer.`);
}
console.log(`[${providerName}] Ensuring function calling is enabled (polling up to ${maxDuration / 1000}s).`);
const tryEnableFunctionCalling = () => {
console.log(`[${providerName}] Polling for function calling toggle. Elapsed: ${elapsedTime}ms`);
const functionCallingToggle = document.querySelector('button[aria-label="Function calling"]');
if (functionCallingToggle) {
const isChecked = functionCallingToggle.getAttribute('aria-checked') === 'true';
if (!isChecked) {
console.log(`[${providerName}] Function calling toggle found and is NOT checked. Attempting to enable...`);
functionCallingToggle.click();
// Verify after a short delay if the click was successful
setTimeout(() => {
const stillChecked = functionCallingToggle.getAttribute('aria-checked') === 'true';
if (stillChecked) {
console.log(`[${providerName}] Function calling successfully enabled after click.`);
} else {
console.warn(`[${providerName}] Clicked function calling toggle, but it did NOT become checked. It might be disabled or unresponsive.`);
}
}, 200);
} else {
console.log(`[${providerName}] Function calling toggle found and is already enabled.`);
}
this.functionCallingPollTimer = null; // Clear timer once action is taken or element found
} else {
elapsedTime += checkInterval;
if (elapsedTime < maxDuration) {
console.log(`[${providerName}] Function calling toggle not found, will retry in ${checkInterval}ms.`);
this.functionCallingPollTimer = setTimeout(tryEnableFunctionCalling, checkInterval);
} else {
console.warn(`[${providerName}] Function calling toggle button (selector: 'button[aria-label="Function calling"]') not found after ${maxDuration}ms. It might not be available on this page/view or selector is incorrect.`);
this.functionCallingPollTimer = null; // Clear timer
}
}
};
// Start the first attempt after a brief initial delay
this.functionCallingPollTimer = setTimeout(tryEnableFunctionCalling, 500);
}
*/
// Send a message to the chat interface
async sendChatMessage(messageContent) {
console.log(`[${this.name}] sendChatMessage called with content type:`, typeof messageContent, Array.isArray(messageContent) ? `Array length: ${messageContent.length}` : '');
const inputField = document.querySelector(this.inputSelector);
const sendButton = document.querySelector(this.sendButtonSelector);
if (!inputField || !sendButton) {
console.error(`[${this.name}] Missing input field or send button. Input: ${this.inputSelector}, Button: ${this.sendButtonSelector}`);
return false;
}
console.log(`[${this.name}] Attempting to send message to Claude with:`, {
inputField: inputField.className,
sendButton: sendButton.getAttribute('aria-label') || sendButton.className
});
try {
let textToInput = "";
let blobToPaste = null;
let blobMimeType = "image/png"; // Default MIME type
if (typeof messageContent === 'string') {
textToInput = messageContent;
this.lastSentMessage = textToInput;
console.log(`[${this.name}] Handling string content:`, textToInput.substring(0, 100) + "...");
} else if (messageContent instanceof Blob) {
blobToPaste = messageContent;
blobMimeType = messageContent.type || blobMimeType;
this.lastSentMessage = `Blob data (type: ${blobMimeType}, size: ${blobToPaste.size})`;
console.log(`[${this.name}] Handling Blob content. Size: ${blobToPaste.size}, Type: ${blobMimeType}`);
} else if (Array.isArray(messageContent)) {
console.log(`[${this.name}] Handling array content.`);
for (const part of messageContent) {
if (part.type === "text" && typeof part.text === 'string') {
textToInput += (textToInput ? "\n" : "") + part.text;
console.log(`[${this.name}] Added text part:`, part.text.substring(0, 50) + "...");
} else if (part.type === "image_url" && part.image_url && typeof part.image_url.url === 'string') {
if (!blobToPaste) { // Prioritize the first image found
try {
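const response = await fetch(part.image_url.url);
// Treat HTTP error statuses as failures so an error page is never pasted as an image
if (!response.ok) throw new Error(`HTTP ${response.status} while fetching image`);
blobToPaste = await response.blob();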
blobMimeType = blobToPaste.type || blobMimeType;
console.log(`[${this.name}] Fetched image_url as Blob. Size: ${blobToPaste.size}, Type: ${blobMimeType}`);
} catch (e) {
console.error(`[${this.name}] Error fetching image_url ${part.image_url.url}:`, e);
}
} else {
console.warn(`[${this.name}] Multiple image_urls found, only the first will be pasted.`);
}
}
}
this.lastSentMessage = `Array content (Text: "${textToInput.substring(0,50)}...", Image: ${blobToPaste ? 'Yes' : 'No'})`;
} else {
console.error(`[${this.name}] Unhandled message content type: ${typeof messageContent}. Cannot send.`);
this.lastSentMessage = `Unhandled data type: ${typeof messageContent}`;
return false;
}
// Set text input if any
if (textToInput) {
inputField.textContent = textToInput; // Use textContent for contenteditable div
inputField.dispatchEvent(new Event('input', { bubbles: true }));
console.log(`[${this.name}] Set input field textContent with accumulated text.`);
} else {
// If there's no text but an image, ensure the input field is clear
inputField.textContent = "";
inputField.dispatchEvent(new Event('input', { bubbles: true }));
}
// Paste blob if any
if (blobToPaste) {
const dataTransfer = new DataTransfer();
const file = new File([blobToPaste], "pasted_image." + (blobMimeType.split('/')[1] || 'png'), { type: blobMimeType });
dataTransfer.items.add(file);
const pasteEvent = new ClipboardEvent('paste', {
clipboardData: dataTransfer,
bubbles: true,
cancelable: true
});
inputField.dispatchEvent(pasteEvent);
console.log(`[${this.name}] Dispatched paste event with Blob data.`);
}
inputField.focus();
await new Promise(resolve => setTimeout(resolve, 100));
let attempts = 0;
const maxAttempts = 60; // Try up to 60 times (5 minutes total)
const retryDelay = 5000; // 5 seconds delay between attempts
while (attempts < maxAttempts) {
const isDisabled = sendButton.disabled ||
sendButton.getAttribute('aria-disabled') === 'true' ||
sendButton.classList.contains('disabled');
if (!isDisabled) {
// Removed check for input field content matching lastSentMessage
// as it can cause issues when there are multiple messages waiting to be sent
console.log(`[${this.name}] Send button is enabled. Clicking send button (attempt ${attempts + 1}).`);
sendButton.click();
return true; // Successfully clicked
}
attempts++;
if (attempts >= maxAttempts) {
console.error(`[${this.name}] Send button remained disabled after ${maxAttempts} attempts. Failed to send message.`);
return false; // Failed to send
}
console.log(`[${this.name}] Send button is disabled (attempt ${attempts}). Trying to enable and will retry in ${retryDelay}ms.`);
// Attempt to trigger UI updates that might enable the button
inputField.dispatchEvent(new Event('input', { bubbles: true })); // Re-dispatch input
inputField.dispatchEvent(new Event('change', { bubbles: true }));
inputField.dispatchEvent(new Event('blur', { bubbles: true }));
// Focusing and blurring the input sometimes helps enable send buttons
inputField.focus();
await new Promise(resolve => setTimeout(resolve, 50)); // Short delay for focus
inputField.blur();
await new Promise(resolve => setTimeout(resolve, retryDelay));
}
// Should not be reached if logic is correct, but as a fallback:
console.error(`[${this.name}] Exited send button check loop unexpectedly.`);
return false;
} catch (error) {
console.error(`[${this.name}] Error sending message to Claude:`, error);
return false;
}
}
initiateResponseCapture(requestId, responseCallback) {
console.log(`[${this.name}] initiateResponseCapture called for requestId: ${requestId}. CURRENT CAPTURE METHOD IS: ${this.captureMethod}`);
if (this.captureMethod === "debugger") {
this.pendingResponseCallbacks.set(requestId, responseCallback);
console.log(`[${this.name}] Stored callback for debugger response, requestId: ${requestId}`);
} else if (this.captureMethod === "dom") {
console.log(`[${this.name}] Starting DOM monitoring for requestId: ${requestId}`);
this.pendingResponseCallbacks.set(requestId, responseCallback);
this._stopDOMMonitoring();
this._startDOMMonitoring(requestId);
} else {
console.error(`[${this.name}] Unknown capture method: ${this.captureMethod}`);
responseCallback(requestId, `[Error: Unknown capture method '${this.captureMethod}' in provider]`, true);
this.pendingResponseCallbacks.delete(requestId);
}
}
handleDebuggerData(requestId, rawData, isFinalFromBackground) {
// Entry log: emitted on every call so debugger traffic can be traced per request.
console.log('[[ClaudeProvider]] handleDebuggerData ENTERED. RequestId: ' + requestId + ', isFinalFromBackground: ' + isFinalFromBackground + ', RawData Length: ' + (rawData ? rawData.length : 'null'));
const callback = this.pendingResponseCallbacks.get(requestId);
if (!callback) {
console.warn('[' + this.name + '] No pending callback for requestId: ' + requestId + '. Ignoring.');
if (this.requestBuffers.has(requestId)) {
this.requestBuffers.delete(requestId);
}
return;
}
if (!this.requestBuffers.has(requestId)) {
this.requestBuffers.set(requestId, { accumulatedText: "" });
}
const requestBuffer = this.requestBuffers.get(requestId);
let textFromCurrentChunk = "";
let isLogicalEndOfMessageInChunk = false;
if (rawData && rawData.trim() !== "") {
console.log('[' + this.name + '] handleDebuggerData: Processing rawData for ' + requestId + '. Accumulated before: ' + requestBuffer.accumulatedText.length);
const parseOutput = this.parseDebuggerResponse(rawData, requestId);
textFromCurrentChunk = parseOutput.text;
isLogicalEndOfMessageInChunk = parseOutput.isFinalResponse;
if (textFromCurrentChunk) {
requestBuffer.accumulatedText += textFromCurrentChunk;
}
console.log('[' + this.name + '] handleDebuggerData: Parsed chunk for ' + requestId + '. TextInChunk: ' + (textFromCurrentChunk ? textFromCurrentChunk.substring(0,50) : 'N/A') + '..., LogicalEndInChunk: ' + isLogicalEndOfMessageInChunk + '. Accumulated after: ' + requestBuffer.accumulatedText.length);
} else {
console.log('[' + this.name + '] handleDebuggerData: Received empty/whitespace rawData for ' + requestId + '. isFinalFromBackground: ' + isFinalFromBackground + '. Accumulated: ' + requestBuffer.accumulatedText.length);
}
// Final when this chunk carries an end-of-message event or the background script reports the stream as finished.
const shouldSendFinalResponse = isLogicalEndOfMessageInChunk || isFinalFromBackground;
console.log('[' + this.name + '] handleDebuggerData: Eval for ' + requestId + '. LogicalEnd: ' + isLogicalEndOfMessageInChunk + ', isFinalBG: ' + isFinalFromBackground + ', includeThinking: ' + this.includeThinkingInMessage + ', AccLen: ' + requestBuffer.accumulatedText.length + '. ShouldSendFinal: ' + shouldSendFinalResponse);
if (shouldSendFinalResponse) {
console.log('[' + this.name + '] handleDebuggerData: FINAL RESPONSE condition met for ' + requestId + '. Sending full accumulated text. Length: ' + requestBuffer.accumulatedText.length);
callback(requestId, requestBuffer.accumulatedText, true);
this.pendingResponseCallbacks.delete(requestId);
this.requestBuffers.delete(requestId);
} else if (this.includeThinkingInMessage && textFromCurrentChunk) {
console.log('[' + this.name + '] handleDebuggerData: Sending INTERMEDIATE chunk for ' + requestId + '. Text: ' + (textFromCurrentChunk ? textFromCurrentChunk.substring(0,50) : 'N/A') + '...');
callback(requestId, textFromCurrentChunk, false);
} else {
console.log('[' + this.name + '] handleDebuggerData: Not sending response for ' + requestId + ' YET.');
}
}
// --- START OF CLAUDE SSE DEBUGGER PARSING LOGIC ---
parseDebuggerResponse(sseChunk, requestIdForLog = 'unknown') {
// console.log('[' + this.name + '] Parsing Claude SSE chunk for reqId ' + requestIdForLog + ' (first 300): ' + (sseChunk ? sseChunk.substring(0, 300) : "null"));
let extractedTextThisChunk = "";
let isEndOfMessageEventInThisChunk = false;
// SSE events are separated by a blank line; accept both LF and CRLF line endings.
const sseMessages = sseChunk.split(/\r?\n\r?\n/);
for (const sseMessage of sseMessages) {
if (sseMessage.trim() === "") continue;
let eventType = null;
let jsonDataString = null;
const lines = sseMessage.split(/\r?\n/);
for (const line of lines) {
if (line.startsWith("event:")) {
eventType = line.substring("event:".length).trim();
} else if (line.startsWith("data:")) {
jsonDataString = line.substring("data:".length).trim();
}
}
if (eventType === "message_stop") {
console.log('[' + this.name + '] ReqId ' + requestIdForLog + ' - Event: "message_stop" detected. Marking EOM.');
isEndOfMessageEventInThisChunk = true;
} else if (eventType && jsonDataString) {
try {
const dataObject = JSON.parse(jsonDataString);
if (eventType === "content_block_delta") {
if (dataObject.delta && dataObject.delta.type === "text_delta" && typeof dataObject.delta.text === 'string') {
extractedTextThisChunk += dataObject.delta.text;
}
} else if (eventType === "message_delta") {
if (dataObject.delta && dataObject.delta.stop_reason) {
console.log('[' + this.name + '] ReqId ' + requestIdForLog + ' - Event: "message_delta" with stop_reason: ' + dataObject.delta.stop_reason + '. Marking EOM.');
isEndOfMessageEventInThisChunk = true;
}
}
} catch (e) {
console.warn('[' + this.name + '] ReqId ' + requestIdForLog + ' - Error parsing JSON from Claude SSE event \'' + eventType + '\':', e, "JSON Data:", jsonDataString);
}
}
}
// console.log('[' + this.name + '] parseDebuggerResponse for reqId ' + requestIdForLog + ' result: Text: "' + (extractedTextThisChunk ? extractedTextThisChunk.substring(0,50) : "N/A") + '...", isEOM: ' + isEndOfMessageEventInThisChunk);
return { text: extractedTextThisChunk, isFinalResponse: isEndOfMessageEventInThisChunk };
}
// --- END OF CLAUDE SSE DEBUGGER PARSING LOGIC ---
formatOutput(thinkingText, answerText) {
if (this.includeThinkingInMessage && thinkingText && thinkingText.trim() !== "") {
try {
const result = {
thinking: thinkingText.trim(),
answer: (answerText || "").trim()
};
return JSON.stringify(result);
} catch (e) {
console.error(`[${this.name}] Error stringifying thinking/answer object:`, e);
return (answerText || "").trim();
}
}
return (answerText || "").trim();
}
// --- Other methods (DOM fallback, etc. - largely unchanged but included for completeness) ---
_captureResponseDOM(element = null) {
console.log(`[${this.name}] _captureResponseDOM (DOM method) called with element:`, element);
if (!element && this.captureMethod === "dom") {
const elements = document.querySelectorAll(this.responseSelector);
if (elements.length > 0) {
element = elements[elements.length - 1];
console.log(`[${this.name}] _captureResponseDOM: Found element via querySelector during polling.`);
}
}
if (!element) {
console.log(`[${this.name}] _captureResponseDOM: No element provided or found.`);
return { found: false, text: '' };
}
if (this._isResponseStillGeneratingDOM()) {
console.log(`[${this.name}] Response is still being generated (_isResponseStillGeneratingDOM check), waiting for completion`);
return { found: false, text: '' };
}
console.log(`[${this.name}] Attempting to capture DOM response from Claude...`);
let responseText = "";
let foundResponse = false;
try {
// Simplified DOM capture for Claude - assumes response is in a known container
// This part would need to be specific to Claude's DOM structure if DOM capture is used.
// For now, focusing on debugger method.
const responseElements = document.querySelectorAll(this.responseSelectorForDOMFallback); // Use appropriate selector
if (responseElements.length > 0) {
const lastResponseElement = responseElements[responseElements.length -1];
// Check if it's a model response and not user input etc.
// This is highly dependent on Claude's actual DOM structure.
// Example:
// if (lastResponseElement.closest('.message-row[data-role="assistant"]')) {
// responseText = lastResponseElement.textContent.trim();
// foundResponse = true;
// }
responseText = lastResponseElement.textContent.trim(); // Placeholder
if (responseText && responseText !== this.lastSentMessage) {
foundResponse = true;
}
}
if (!foundResponse) {
console.log("CLAUDE (DOM): Response not found yet.");
}
} catch (error) {
console.error("CLAUDE (DOM): Error capturing response:", error);
}
if (foundResponse && responseText) {
responseText = responseText.trim()
.replace(/^(Loading|Thinking).*/gim, '') // General cleanup
.replace(/\n{3,}/g, '\n\n')
.trim();
}
return {
found: foundResponse && !!responseText.trim(),
text: responseText
};
}
_findResponseElementDOM(container) {
console.log(`[${this.name}] _findResponseElementDOM called on container:`, container);
if (!container) return null;
const elements = container.querySelectorAll(this.responseSelectorForDOMFallback);
if (elements.length > 0) {
const lastElement = elements[elements.length - 1];
console.log(`[${this.name}] Found last response element via DOM:`, lastElement);
// Add checks to ensure it's not the user's input or an old response
if (lastElement.textContent && lastElement.textContent.trim() !== this.lastSentMessage) {
return lastElement;
}
}
console.log(`[${this.name}] No suitable response element found via DOM in container.`);
return null;
}
shouldSkipResponseMonitoring() {
// Example: if a provider indicates via a specific property or method
// For CLAUDE, if using debugger, we don't need DOM monitoring.
// This method is more for providers that might sometimes use DOM, sometimes not.
// console.log(`[${this.name}] shouldSkipResponseMonitoring called. Capture method: ${this.captureMethod}`);
return this.captureMethod === "debugger";
}
_isResponseStillGeneratingDOM() {
// This is for the DOM fallback method
const thinkingIndicator = document.querySelector(this.thinkingIndicatorSelectorForDOM);
if (thinkingIndicator) {
// console.log(`[${this.name}] DOM Fallback: Thinking indicator found.`);
return true;
}
// console.log(`[${this.name}] DOM Fallback: No thinking indicator found.`);
return false;
}
getStreamingApiPatterns() {
console.log(`[${this.name}] getStreamingApiPatterns called. Capture method: ${this.captureMethod}`);
if (this.captureMethod === "debugger" && this.debuggerUrlPattern) {
console.log(`[${this.name}] Using debugger URL pattern: ${this.debuggerUrlPattern}`);
return [{ urlPattern: this.debuggerUrlPattern, requestStage: "Response" }];
}
console.log(`[${this.name}] No debugger patterns to return (captureMethod is not 'debugger' or no pattern set).`);
return [];
}
_startDOMMonitoring(requestId) {
console.log(`[${this.name}] DOM Fallback: _startDOMMonitoring for requestId: ${requestId}`);
this._stopDOMMonitoring(); // Stop any existing observer
const callback = this.pendingResponseCallbacks.get(requestId);
if (!callback) {
console.error(`[${this.name}] DOM Fallback: No callback for requestId ${requestId} in _startDOMMonitoring.`);
return;
}
let attempts = 0;
const maxAttempts = 15; // Try for ~15 seconds
const interval = 1000;
this.domMonitorTimer = setInterval(() => {
console.log(`[${this.name}] DOM Fallback: Polling attempt ${attempts + 1}/${maxAttempts} for requestId: ${requestId}`);
const responseData = this._captureResponseDOM(); // Will use this.responseSelectorForDOMFallback
if (responseData.found && responseData.text.trim() !== "") {
console.log(`[${this.name}] DOM Fallback: Response captured for requestId ${requestId}. Text (first 100): ${responseData.text.substring(0,100)}`);
this._stopDOMMonitoring();
callback(requestId, responseData.text, true); // Assume final for DOM capture
this.pendingResponseCallbacks.delete(requestId);
} else {
attempts++;
if (attempts >= maxAttempts) {
console.warn(`[${this.name}] DOM Fallback: Max attempts reached for requestId ${requestId}. No response captured.`);
this._stopDOMMonitoring();
callback(requestId, "[Error: Timed out waiting for DOM response]", true); // Error, final
this.pendingResponseCallbacks.delete(requestId);
}
}
}, interval);
console.log(`[${this.name}] DOM Fallback: Monitoring started with timer ID ${this.domMonitorTimer}`);
}
_stopDOMMonitoring() {
if (this.domMonitorTimer) {
console.log(`[${this.name}] DOM Fallback: Stopping DOM monitoring timer ID ${this.domMonitorTimer}`);
clearInterval(this.domMonitorTimer);
this.domMonitorTimer = null;
}
}
}
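The SSE handling in parseDebuggerResponse can be sketched as two standalone functions. This is a minimal sketch, not the extension's code: `parseSseEvents` and `extractText` are hypothetical names, and the sketch assumes each event's `data:` payload is JSON, as the method above does. It differs from the method in one respect: multiple `data:` lines in one event are joined rather than overwritten.

```javascript
// Hypothetical helper (not part of the extension): split a raw SSE chunk into events.
function parseSseEvents(chunk) {
  const events = [];
  // Events are separated by a blank line; tolerate both LF and CRLF line endings.
  for (const block of chunk.split(/\r?\n\r?\n/)) {
    if (block.trim() === '') continue;
    let event = 'message'; // Default event type when no "event:" field is present
    const dataLines = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith('event:')) {
        event = line.slice('event:'.length).trim();
      } else if (line.startsWith('data:')) {
        dataLines.push(line.slice('data:'.length).trim());
      }
    }
    events.push({ event, data: dataLines.join('\n') });
  }
  return events;
}

// Hypothetical helper: accumulate text deltas and detect end-of-message,
// using the same event names that parseDebuggerResponse checks.
function extractText(chunk) {
  let text = '';
  let isFinalResponse = false;
  for (const { event, data } of parseSseEvents(chunk)) {
    if (event === 'message_stop') {
      isFinalResponse = true;
      continue;
    }
    if (!data) continue;
    let payload;
    try {
      payload = JSON.parse(data);
    } catch (e) {
      continue; // Skip events whose payload is not valid JSON
    }
    const delta = payload.delta;
    if (event === 'content_block_delta' && delta && delta.type === 'text_delta' && typeof delta.text === 'string') {
      text += delta.text;
    } else if (event === 'message_delta' && delta && delta.stop_reason) {
      isFinalResponse = true;
    }
  }
  return { text, isFinalResponse };
}
```

Keeping the event splitting separate from the payload interpretation makes the splitting logic testable without any Claude-specific event names.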
// Ensure the provider is available on the window for the content script
if (window.providerUtils) {
const providerInstance = new ClaudeProvider();
window.providerUtils.registerProvider(
providerInstance.name,
providerInstance.supportedDomains,
providerInstance
);
} else {
console.error("CLAUDE: providerUtils not found. Registration failed.");
}
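The per-request buffering that handleDebuggerData performs (append each chunk, flush on the final one, then drop the buffer) can be isolated as a small class. This is a sketch under assumptions: `ResponseAccumulator` and its `push` method are hypothetical names and are not part of the extension.

```javascript
// Hypothetical class (not part of the extension): buffers streamed text per request ID.
class ResponseAccumulator {
  constructor() {
    this.buffers = new Map(); // requestId -> text accumulated so far
  }

  // Append a chunk. When isFinal is true, return the full text and drop the buffer.
  push(requestId, text, isFinal) {
    const accumulated = (this.buffers.get(requestId) || '') + (text || '');
    if (isFinal) {
      this.buffers.delete(requestId);
      return { done: true, text: accumulated };
    }
    this.buffers.set(requestId, accumulated);
    return { done: false, text: accumulated };
  }
}

// Usage: two streamed chunks followed by an empty final notification.
const accumulator = new ResponseAccumulator();
accumulator.push('req-1', 'Hel', false);
accumulator.push('req-1', 'lo', false);
const finished = accumulator.push('req-1', '', true);
console.log(finished.text);
```

Deleting the buffer on the final chunk mirrors the cleanup of requestBuffers above, so a reused request ID starts from an empty string.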

@ -0,0 +1,442 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// AI Chat Relay - Gemini Provider
class GeminiProvider {
constructor() {
this.name = 'GeminiProvider'; // Updated
this.supportedDomains = ['gemini.google.com'];
// Selectors for the Gemini interface
this.inputSelector = 'div.ql-editor, div[contenteditable="true"], textarea[placeholder="Enter a prompt here"], textarea.message-input, textarea.input-area';
this.sendButtonSelector = 'button[aria-label="Send message"], button.send-button, button.send-message-button';
// Response selector - updated to match the actual elements
this.responseSelector = 'model-response, message-content, .model-response-text, .markdown-main-panel, .model-response, div[id^="model-response-message"]';
// Thinking indicator selector
this.thinkingIndicatorSelector = '.thinking-indicator, .loading-indicator, .typing-indicator, .response-loading, .blue-circle, .stop-icon';
// Fallback selectors (NEW)
this.responseSelectorForDOMFallback = 'model-response, message-content, .model-response-text, .markdown-main-panel'; // Placeholder
this.thinkingIndicatorSelectorForDOM = '.thinking-indicator, .loading-indicator, .blue-circle, .stop-icon'; // Placeholder
// Last sent message to avoid capturing it as a response
this.lastSentMessage = '';
// Flag to prevent double-sending - IMPORTANT: This must be false by default
this.hasSentMessage = false;
}
// Send a message to the chat interface (MODIFIED)
async sendChatMessage(text) {
console.log(`[${this.name}] sendChatMessage called with:`, text);
const inputElement = document.querySelector(this.inputSelector);
const sendButton = document.querySelector(this.sendButtonSelector);
if (!inputElement || !sendButton) {
console.error(`[${this.name}] Missing input field (${this.inputSelector}) or send button (${this.sendButtonSelector})`);
return false;
}
console.log(`[${this.name}] Attempting to send message with:`, {
inputFieldInfo: inputElement.outerHTML.substring(0,100),
sendButtonInfo: sendButton.outerHTML.substring(0,100)
});
try {
this.lastSentMessage = text;
console.log(`[${this.name}] Stored last sent message:`, this.lastSentMessage);
if (inputElement.tagName.toLowerCase() === 'div' && (inputElement.contentEditable === 'true' || inputElement.getAttribute('contenteditable') === 'true')) {
console.log(`[${this.name}] Input field is a contentEditable div.`);
inputElement.focus();
inputElement.innerHTML = ''; // Clear existing content
inputElement.textContent = text; // Set the new text content
inputElement.dispatchEvent(new Event('input', { bubbles: true, composed: true }));
console.log(`[${this.name}] Set text content and dispatched input event for contentEditable div.`);
} else { // Standard input or textarea
console.log(`[${this.name}] Input field is textarea/input.`);
inputElement.value = text;
inputElement.dispatchEvent(new Event('input', { bubbles: true }));
inputElement.focus();
console.log(`[${this.name}] Set value and dispatched input event for textarea/input.`);
}
await new Promise(resolve => setTimeout(resolve, 500)); // Preserved delay
const isDisabled = sendButton.disabled ||
sendButton.getAttribute('aria-disabled') === 'true' ||
sendButton.classList.contains('disabled');
if (!isDisabled) {
console.log(`[${this.name}] Clicking send button.`);
sendButton.click();
return true;
} else {
console.warn(`[${this.name}] Send button is disabled. Cannot send message.`);
return false;
}
} catch (error) {
console.error(`[${this.name}] Error sending message:`, error);
return false;
}
}
// Capture response from the chat interface (Original logic, logs updated for consistency)
captureResponse(element) {
if (!element) {
console.log(`[${this.name}] No element provided to captureResponse`);
return { found: false, text: '' };
}
console.log(`[${this.name}] Attempting to capture response from Gemini:`, element);
let responseText = "";
let foundResponse = false;
try {
console.log(`[${this.name}] Looking for response in various elements...`);
if (element.textContent) {
console.log(`[${this.name}] Element has text content`);
responseText = element.textContent.trim();
if (responseText &&
responseText !== this.lastSentMessage &&
!responseText.includes("Loading") &&
!responseText.includes("Thinking") &&
!responseText.includes("You stopped this response")) {
console.log(`[${this.name}] Found response in element:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
} else {
console.log(`[${this.name}] Element text appears to be invalid:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
}
} else {
console.log(`[${this.name}] Element has no text content`);
}
console.log(`[${this.name}] Trying to find the most recent conversation container...`);
const conversationContainers = document.querySelectorAll('.conversation-container');
if (conversationContainers && conversationContainers.length > 0) {
console.log(`[${this.name}] Found ${conversationContainers.length} conversation containers`);
const lastContainer = conversationContainers[conversationContainers.length - 1];
console.log(`[${this.name}] Last container ID:`, lastContainer.id);
const userQuery = lastContainer.querySelector('.user-query-container');
const userText = userQuery ? userQuery.textContent.trim() : '';
if (userText === this.lastSentMessage) {
console.log(`[${this.name}] Found container with our last sent message, looking for response`);
}
const modelResponse = lastContainer.querySelector('model-response');
if (modelResponse) {
console.log(`[${this.name}] Found model-response in last conversation container`);
const messageContent = modelResponse.querySelector('message-content.model-response-text');
if (messageContent) {
console.log(`[${this.name}] Found message-content in model-response`);
const markdownDiv = messageContent.querySelector('.markdown');
if (markdownDiv) {
console.log(`[${this.name}] Found markdown div in message-content`);
const text = markdownDiv.textContent.trim();
if (text &&
text !== this.lastSentMessage &&
!text.includes("Loading") &&
!text.includes("Thinking") &&
!text.includes("You stopped this response")) {
responseText = text;
console.log(`[${this.name}] Found response in markdown div:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
}
}
}
}
} else {
console.log(`[${this.name}] No conversation containers found`);
}
if (!foundResponse) {
console.log(`[${this.name}] Trying to find model-response-message-content elements...`);
const responseMessages = document.querySelectorAll('div[id^="model-response-message-content"]');
if (responseMessages && responseMessages.length > 0) {
console.log(`[${this.name}] Found ${responseMessages.length} model-response-message-content elements`);
const sortedMessages = Array.from(responseMessages).sort((a, b) => {
return a.id.localeCompare(b.id);
});
const responseMessage = sortedMessages[sortedMessages.length - 1];
console.log(`[${this.name}] Last response message ID:`, responseMessage.id);
const text = responseMessage.textContent.trim();
if (text &&
text !== this.lastSentMessage &&
!text.includes("Loading") &&
!text.includes("Thinking") &&
!text.includes("You stopped this response")) {
responseText = text;
console.log(`[${this.name}] Found response in model-response-message:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
}
} else {
console.log(`[${this.name}] No model-response-message-content elements found`);
}
}
if (!foundResponse) {
console.log(`[${this.name}] Trying to find message-content elements...`);
const messageContents = document.querySelectorAll('message-content.model-response-text');
if (messageContents && messageContents.length > 0) {
console.log(`[${this.name}] Found ${messageContents.length} message-content elements`);
const sortedContents = Array.from(messageContents).sort((a, b) => {
return (a.id || '').localeCompare(b.id || '');
});
const lastMessageContent = sortedContents[sortedContents.length - 1];
console.log(`[${this.name}] Last message content ID:`, lastMessageContent.id || 'no-id');
const markdownDiv = lastMessageContent.querySelector('.markdown');
if (markdownDiv) {
console.log(`[${this.name}] Found markdown div`);
const paragraphs = markdownDiv.querySelectorAll('p');
if (paragraphs && paragraphs.length > 0) {
console.log(`[${this.name}] Found ${paragraphs.length} paragraphs in markdown div`);
let combinedText = "";
paragraphs.forEach((p, index) => {
const text = p.textContent.trim();
console.log(`[${this.name}] Paragraph ${index} text:`, text.substring(0, 30) + (text.length > 30 ? "..." : ""));
if (text &&
text !== this.lastSentMessage &&
!text.includes("Loading") &&
!text.includes("Thinking") &&
!text.includes("You stopped this response")) {
combinedText += text + "\n";
}
});
if (combinedText.trim()) {
responseText = combinedText.trim();
console.log(`[${this.name}] Found response in paragraphs:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
} else {
console.log(`[${this.name}] No valid text found in paragraphs`);
}
} else {
console.log(`[${this.name}] No paragraphs found in markdown div`);
const text = markdownDiv.textContent.trim();
if (text &&
text !== this.lastSentMessage &&
!text.includes("Loading") &&
!text.includes("Thinking") &&
!text.includes("You stopped this response")) {
responseText = text;
console.log(`[${this.name}] Found response in markdown div:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
} else {
console.log(`[${this.name}] Markdown div text appears to be invalid:`, text.substring(0, 50) + (text.length > 50 ? "..." : ""));
}
}
} else {
console.log(`[${this.name}] No markdown div found in message-content`);
const text = lastMessageContent.textContent.trim();
if (text &&
text !== this.lastSentMessage &&
!text.includes("Loading") &&
!text.includes("Thinking") &&
!text.includes("You stopped this response")) {
responseText = text;
console.log(`[${this.name}] Found response in message-content:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
} else {
console.log(`[${this.name}] Message-content text appears to be invalid:`, text.substring(0, 50) + (text.length > 50 ? "..." : ""));
}
}
} else {
console.log(`[${this.name}] No message-content elements found`);
}
}
if (!foundResponse) {
console.log(`[${this.name}] Trying to find paragraphs in the document...`);
const paragraphs = document.querySelectorAll('p');
if (paragraphs && paragraphs.length > 0) {
console.log(`[${this.name}] Found ${paragraphs.length} paragraphs`);
let combinedText = "";
for (let i = paragraphs.length - 1; i >= 0; i--) {
const paragraph = paragraphs[i];
const text = paragraph.textContent.trim();
const isUserQuery = paragraph.closest('.user-query-container, .user-query-bubble-container');
if (isUserQuery) {
continue;
}
if (text &&
text !== this.lastSentMessage &&
!text.includes("Loading") &&
!text.includes("Thinking") &&
!text.includes("You stopped this response")) {
combinedText = text + "\n" + combinedText;
if (text.startsWith("Hello") || text.includes("I'm doing") || text.includes("How can I assist")) {
break;
}
}
}
if (combinedText.trim()) {
responseText = combinedText.trim();
console.log(`[${this.name}] Found response in paragraphs:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
foundResponse = true;
} else {
console.log(`[${this.name}] No valid text found in paragraphs`);
}
} else {
console.log(`[${this.name}] No paragraphs found`);
}
}
if (!foundResponse) {
console.log(`[${this.name}] Response not found yet, will try again in the next polling cycle`);
}
} catch (error) {
console.error(`[${this.name}] Error capturing response from Gemini:`, error);
}
if (foundResponse && responseText) {
console.log(`[${this.name}] Cleaning up response text...`);
responseText = responseText.trim()
.replace(/^(Loading|Thinking).*/gim, '')
.replace(/You stopped this response.*/gim, '')
.replace(/\n{3,}/g, '\n\n')
.trim();
console.log(`[${this.name}] Cleaned response text:`, responseText.substring(0, 50) + (responseText.length > 50 ? "..." : ""));
}
return {
found: foundResponse && !!responseText.trim(),
text: responseText
};
}
// (NEW) Method for streaming API patterns
getStreamingApiPatterns() {
console.log(`[${this.name}] getStreamingApiPatterns called`);
// TODO: DEVELOPER ACTION REQUIRED!
// Use browser Network DevTools on gemini.google.com to identify the
// exact URL(s) that deliver the AI's streaming response when a prompt is sent.
// Replace the placeholder pattern below with the correct one(s).
// Example: return [{ urlPattern: "*://gemini.google.com/api/generate*", requestStage: "Response" }];
return [
{ urlPattern: "*://gemini.google.com/api/stream/generateContent*", requestStage: "Response" } // Placeholder - VERIFY THIS!
];
}
// (NEW) Optional Fallback Methods
async captureResponseDOMFallback() {
console.log(`[${this.name}] captureResponseDOMFallback called.`);
// Minimal DOM fallback: return the text of the last element matching
// this.responseSelectorForDOMFallback, or an empty string when nothing matches.
// TODO: Verify the selector against Gemini's current DOM before relying on this path.
const responseElements = document.querySelectorAll(this.responseSelectorForDOMFallback);
if (responseElements.length > 0) {
const lastResponse = responseElements[responseElements.length - 1];
return lastResponse.textContent.trim();
}
return ""; // No response element found
}
isResponseStillGeneratingForDOM() {
console.log(`[${this.name}] isResponseStillGeneratingForDOM called.`);
// DOM check for the thinking indicator, using this.thinkingIndicatorSelectorForDOM.
// TODO: Verify the selector against Gemini's current DOM.
const thinkingIndicator = document.querySelector(this.thinkingIndicatorSelectorForDOM);
return !!thinkingIndicator && thinkingIndicator.offsetParent !== null; // True only while the indicator is visible
}
// Find a response element in a container (Original logic, logs updated for consistency)
findResponseElement(container) {
console.log(`[${this.name}] Finding response element in container:`, container);
if (container.id && container.id.startsWith("model-response-message-content")) {
console.log(`[${this.name}] Container is a model-response-message-content element`);
return container;
}
if (container.querySelector) {
const modelResponseMessage = container.querySelector('div[id^="model-response-message-content"]');
if (modelResponseMessage) {
console.log(`[${this.name}] Found model-response-message-content element in container`);
return modelResponseMessage;
}
}
if (container.classList && container.classList.contains('conversation-container')) {
console.log(`[${this.name}] Container is a conversation container`);
const modelResponse = container.querySelector('model-response');
if (modelResponse) {
console.log(`[${this.name}] Found model-response in conversation container`);
const messageContent = modelResponse.querySelector('message-content.model-response-text');
if (messageContent) {
console.log(`[${this.name}] Found message-content in model-response`);
const markdownDiv = messageContent.querySelector('.markdown');
if (markdownDiv) {
console.log(`[${this.name}] Found markdown div in message-content`);
return markdownDiv;
}
return messageContent;
}
return modelResponse;
}
}
if (container.matches && container.matches(this.responseSelector)) {
console.log(`[${this.name}] Container itself is a response element`);
return container;
}
if (container.querySelector) {
const responseElement = container.querySelector(this.responseSelector);
if (responseElement) {
console.log(`[${this.name}] Found response element in container`);
return responseElement;
}
}
if (container.tagName && container.tagName.toLowerCase() === 'p') {
console.log(`[${this.name}] Container is a paragraph`);
const isUserQuery = container.closest('.user-query-container, .user-query-bubble-container');
if (!isUserQuery) {
console.log(`[${this.name}] Paragraph is not inside a user-query, returning it`);
return container;
} else {
console.log(`[${this.name}] Paragraph is inside a user-query`);
}
}
console.log(`[${this.name}] No response element found in container`);
return null;
}
// Check if we should skip response monitoring (Original - UNCHANGED)
shouldSkipResponseMonitoring() {
// We want to monitor for responses now that we've fixed the response capturing
return false;
}
}
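captureResponse repeats the same five-part text check (non-empty, not the echoed prompt, no placeholder strings) in every fallback branch. A minimal sketch of that check as one function follows. `isUsableResponseText` and `PLACEHOLDER_MARKERS` are hypothetical names, not part of the provider.

```javascript
// Hypothetical constant: the marker strings mirror the ones checked in captureResponse.
const PLACEHOLDER_MARKERS = ['Loading', 'Thinking', 'You stopped this response'];

// Hypothetical helper: true when text looks like a real model response.
function isUsableResponseText(text, lastSentMessage) {
  if (typeof text !== 'string') return false;
  const trimmed = text.trim();
  // Reject empty text and the user's own prompt echoed back.
  if (trimmed === '' || trimmed === lastSentMessage) return false;
  // Reject text that contains any UI placeholder marker.
  return !PLACEHOLDER_MARKERS.some(marker => trimmed.includes(marker));
}

// Usage: a normal answer passes, an echoed prompt does not.
console.log(isUsableResponseText('The answer is 4.', 'What is 2+2?'));
console.log(isUsableResponseText('What is 2+2?', 'What is 2+2?'));
```

Like the original checks, this is a substring match, so a genuine answer that happens to contain "Thinking" or "Loading" is also rejected.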
// Ensure this runs after the class definition (NEW REGISTRATION)
(function() {
if (window.providerUtils) {
const providerInstance = new GeminiProvider();
window.providerUtils.registerProvider(providerInstance.name, providerInstance.supportedDomains, providerInstance);
} else {
console.error("ProviderUtils not found. GeminiProvider cannot be registered.");
}
})();
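The registration call above expects `window.providerUtils` to expose `registerProvider(name, domains, instance)`. The implementation of that function is not shown in this section, so the following is only a sketch of what such a registry could look like. `createProviderRegistry`, `getProviderByName`, and `getProviderForHostname` are hypothetical names.

```javascript
// Hypothetical factory (the real providerUtils implementation is not shown here).
function createProviderRegistry() {
  const byName = new Map();   // provider name -> instance
  const byDomain = new Map(); // supported hostname -> instance
  return {
    // Same argument order as the registerProvider call in the provider files.
    registerProvider(name, domains, instance) {
      byName.set(name, instance);
      for (const domain of domains) {
        byDomain.set(domain, instance);
      }
    },
    getProviderByName(name) {
      return byName.get(name) || null;
    },
    getProviderForHostname(hostname) {
      return byDomain.get(hostname) || null;
    },
    getSupportedDomains() {
      return Array.from(byDomain.keys());
    }
  };
}

// Usage with a stand-in provider object.
const registry = createProviderRegistry();
const fakeProvider = { name: 'GeminiProvider', supportedDomains: ['gemini.google.com'] };
registry.registerProvider(fakeProvider.name, fakeProvider.supportedDomains, fakeProvider);
console.log(registry.getSupportedDomains());
```

With a registry of this shape, each provider file registers itself and no central list of provider globals has to be maintained.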

@ -0,0 +1,59 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// AI Chat Relay - Provider Index
// Map of provider IDs to provider instances
const providers = {
'gemini': window.geminiProvider,
'aistudio': window.aiStudioProvider,
'chatgpt': window.chatgptProvider,
'claude': window.claudeProvider
};
// Get a provider by ID
function getProvider(id) {
return providers[id] || null;
}
// Get a provider based on the current URL
function detectProvider(url) {
if (url.includes('gemini.google.com')) {
return providers.gemini;
} else if (url.includes('aistudio.google.com')) {
return providers.aistudio;
} else if (url.includes('chatgpt.com')) {
return providers.chatgpt;
} else if (url.includes('claude.ai')) {
return providers.claude;
}
// Default to aistudio if we can't detect
return providers.aistudio;
}
// Get all supported domains, skipping providers whose script was never loaded
function getSupportedDomains() {
return Object.values(providers).filter(Boolean).flatMap(provider => provider.supportedDomains || []);
}
// Make functions available globally
window.providerUtils = {
getProvider,
detectProvider,
getSupportedDomains
};
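`detectProvider` above matches substrings of the whole URL, so a URL whose path or query string merely mentions a provider domain would also match. One possible alternative is to parse the URL first and compare only its hostname. The sketch below is an illustration under assumptions, not part of the original file: `HOST_TO_PROVIDER` and `detectProviderId` are hypothetical names, and the sketch returns `null` for unknown hosts instead of defaulting to `aistudio`.

```javascript
// Hypothetical lookup table: hostname -> provider ID (IDs taken from the map above).
const HOST_TO_PROVIDER = {
  'gemini.google.com': 'gemini',
  'aistudio.google.com': 'aistudio',
  'chatgpt.com': 'chatgpt',
  'claude.ai': 'claude'
};

// Returns the provider ID for a URL, or null when the URL is malformed or unknown.
function detectProviderId(url) {
  let hostname;
  try {
    hostname = new URL(url).hostname; // the URL constructor throws on malformed input
  } catch (e) {
    return null;
  }
  // Own-property check so keys such as "constructor" never match by accident.
  return Object.prototype.hasOwnProperty.call(HOST_TO_PROVIDER, hostname)
    ? HOST_TO_PROVIDER[hostname]
    : null;
}
```

Keeping the original fallback would be a small change: return `'aistudio'` in place of the final `null`.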


@ -0,0 +1,62 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// AI Chat Relay - Provider Utilities
// Map of supported domains to provider instances
const providerMap = {}; // Stores { domain: { name: providerName, instance: providerObject } }
// Register a provider with its supported domains and instance
function registerProvider(providerName, domains, providerInstance) {
if (!providerName || !Array.isArray(domains) || !providerInstance) {
console.error("PROVIDER-UTILS: Invalid arguments for registerProvider.", { providerName, domains, providerInstance });
return;
}
domains.forEach(domain => {
providerMap[domain] = { name: providerName, instance: providerInstance };
});
console.log("PROVIDER-UTILS: Registered provider:", providerName, "for domains:", domains);
}
// Detect the provider for the current page
function detectProvider(hostname) {
// Ensure hostname is a string before calling .includes()
if (typeof hostname !== 'string') {
console.warn("PROVIDER-UTILS: Invalid hostname for detectProvider:", hostname);
return null;
}
console.log("PROVIDER-UTILS: Detecting provider for hostname:", hostname);
console.log("PROVIDER-UTILS: Current providerMap:", JSON.stringify(providerMap)); // Log current map for debugging
for (const domainKey in providerMap) {
if (hostname.includes(domainKey)) {
const providerData = providerMap[domainKey];
console.log("PROVIDER-UTILS: Found provider", providerData.name, "for hostname", hostname);
return providerData.instance;
}
}
console.log("PROVIDER-UTILS: No provider found for hostname:", hostname);
return null;
}
// Export the functions
window.providerUtils = {
detectProvider,
registerProvider // Expose registerProvider
};
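The registry above matches with `hostname.includes(domainKey)`, which also accepts hostnames such as `notexample.com` or `example.com.evil.org` for the key `example.com`. A minimal sketch of a stricter check follows, assuming that only exact matches and true subdomains should count. `hostMatchesDomain`, `findProvider`, and `exampleMap` are hypothetical names used for illustration only.

```javascript
// Hypothetical helper: true when hostname equals domain or is a subdomain of it.
function hostMatchesDomain(hostname, domain) {
  if (typeof hostname !== 'string' || typeof domain !== 'string' || domain === '') {
    return false;
  }
  const host = hostname.toLowerCase();
  const dom = domain.toLowerCase();
  return host === dom || host.endsWith('.' + dom);
}

// Stand-alone copy of the registry shape used above: { domain: { name, instance } }.
const exampleMap = {
  'example.com': { name: 'ExampleProvider', instance: { name: 'ExampleProvider' } }
};

// Returns the first provider instance whose domain matches, or null.
function findProvider(hostname, map) {
  for (const domain of Object.keys(map)) {
    if (hostMatchesDomain(hostname, domain)) {
      return map[domain].instance;
    }
  }
  return null;
}
```

With this check, `chat.example.com` resolves to the example provider while `notexample.com` does not.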


@ -0,0 +1,430 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
// AI Chat Relay - Generic Provider Template
// This is a template. You need to customize it for the specific website you want to support.
class GenericProvider {
constructor() {
// --- START OF CONFIGURABLE PROPERTIES ---
// **TODO: CONFIGURE THESE PROPERTIES FOR YOUR TARGET WEBSITE**
// Method for response capture: "debugger" or "dom"
// "debugger": Intercepts network requests. Requires `debuggerUrlPattern`.
// "dom": Observes changes in the webpage's Document Object Model.
this.captureMethod = "dom"; // or "debugger"
// URL pattern for debugger to intercept if captureMethod is "debugger".
// Make this pattern as specific as possible to avoid intercepting unrelated requests.
// Example: "*api.example.com/chat/stream*"
this.debuggerUrlPattern = "*your-api-endpoint-pattern*"; // VERIFY THIS PATTERN IF USING DEBUGGER
// Whether to include "thinking" or intermediary process steps in the message,
// or just the final answer.
// If true, parseDebuggerResponse (if used) should aim to return a JSON string:
// { "thinking": "...", "answer": "..." }
// If false, it should return a string: "answer"
this.includeThinkingInMessage = false;
// --- END OF CONFIGURABLE PROPERTIES ---
// **TODO: UPDATE THE PROVIDER NAME AND SUPPORTED DOMAINS**
this.name = "GenericProvider"; // e.g., "MyCustomChatProvider"
// List of domains this provider will activate on.
// Example: ["chat.example.com", "another.example.net"]
this.supportedDomains = ["example.com"]; // Replace with actual domains
// **TODO: UPDATE SELECTORS FOR YOUR TARGET WEBSITE'S HTML STRUCTURE**
// CSS selector for the main chat input text area.
this.inputSelector = 'textarea[placeholder="Send a message"]'; // Adjust to match the site
// CSS selector for the send button.
this.sendButtonSelector = 'button[aria-label="Send"]'; // Adjust to match the site
// CSS selector for identifying response messages or containers.
// This is crucial for DOM capture and can be complex.
this.responseSelector = '.message-bubble .text-content'; // Adjust to match the site
// CSS selector for an element indicating the AI is "thinking" or generating a response.
this.thinkingIndicatorSelector = '.loading-spinner'; // Adjust to match the site
// Fallback selectors for DOM capture method (if primary ones are too broad or miss things)
// These are often similar to responseSelector but might be more specific or broader.
this.responseSelectorForDOMFallback = '.message-container .response-text'; // Adjust as needed
this.thinkingIndicatorSelectorForDOM = '.thinking-dots, .spinner-animation'; // Adjust as needed
// Stores the last message sent by the user to avoid capturing it as an AI response.
this.lastSentMessage = '';
// Manages callbacks for pending responses, mapping request IDs to callback functions.
this.pendingResponseCallbacks = new Map();
// Timer for DOM monitoring (if captureMethod is "dom")
this.domMonitorTimer = null;
// You might have initialization logic here, e.g., checking for specific site features
// or setting up initial event listeners if absolutely necessary (though most are handled by the core).
console.log(`[${this.name}] Provider initialized for domains: ${this.supportedDomains.join(', ')}`);
}
// Sends a message to the chat interface.
// text: The message string to send.
async sendChatMessage(text) {
console.log(`[${this.name}] sendChatMessage called with:`, text);
const inputField = document.querySelector(this.inputSelector);
const sendButton = document.querySelector(this.sendButtonSelector);
if (!inputField) {
console.error(`[${this.name}] Input field not found with selector: ${this.inputSelector}`);
return false;
}
if (!sendButton) {
console.error(`[${this.name}] Send button not found with selector: ${this.sendButtonSelector}`);
// Attempt to proceed if input field is found, maybe user hits enter.
// But ideally, both should be found.
}
console.log(`[${this.name}] Attempting to send message to target site with:`, {
inputFieldFound: !!inputField,
sendButtonFound: !!sendButton
});
try {
this.lastSentMessage = text;
console.log(`[${this.name}] Stored last sent message:`, this.lastSentMessage);
// Simulate user input
inputField.value = text;
inputField.dispatchEvent(new Event('input', { bubbles: true, cancelable: true }));
inputField.focus();
// Wait a bit for the site's JavaScript to process the input (e.g., enable the send button)
await new Promise(resolve => setTimeout(resolve, 100));
if (sendButton) {
const isDisabled = sendButton.disabled ||
sendButton.getAttribute('aria-disabled') === 'true' ||
sendButton.classList.contains('disabled'); // Common ways to disable buttons
if (!isDisabled) {
console.log(`[${this.name}] Clicking send button.`);
sendButton.click();
} else {
console.warn(`[${this.name}] Send button is disabled. Attempting to submit differently (e.g., form submission or Enter key press).`);
// Fallback: Try to dispatch a 'submit' event on the form if applicable,
// or simulate an Enter key press on the input field.
// This part is highly site-specific.
// Example: inputField.dispatchEvent(new KeyboardEvent('keydown', { key: 'Enter', code: 'Enter', bubbles: true, cancelable: true }));
// For now, we'll just log and assume the user might press Enter manually if button click fails.
if (inputField.form) {
// inputField.form.requestSubmit(); // Modern way
// or inputField.form.submit(); // Older way, might cause full page reload
}
}
} else {
// If no send button, perhaps the site relies on Enter key.
// Consider simulating Enter press here if appropriate for the target site.
console.log(`[${this.name}] Send button not found. User might need to press Enter or an alternative send mechanism.`);
}
return true;
} catch (error) {
console.error(`[${this.name}] Error sending message to target site:`, error);
return false;
}
}
// Initiates response capture for a given request.
// requestId: A unique ID for the chat request.
// responseCallback: Function to call with the (requestId, messageText, isFinal)
initiateResponseCapture(requestId, responseCallback) {
console.log(`[${this.name}] initiateResponseCapture called for requestId: ${requestId}. Capture method: ${this.captureMethod}`);
this.pendingResponseCallbacks.set(requestId, responseCallback);
if (this.captureMethod === "debugger") {
console.log(`[${this.name}] Debugger capture selected. Callback stored for requestId: ${requestId}. Ensure background script is set up for '${this.debuggerUrlPattern}'.`);
// The actual debugger attachment and data forwarding is handled by the background script.
// This provider just needs to be ready to process `handleDebuggerData`.
} else if (this.captureMethod === "dom") {
console.log(`[${this.name}] DOM capture selected. Starting DOM monitoring for requestId: ${requestId}`);
this._stopDOMMonitoring(); // Ensure no old monitors are running
this._startDOMMonitoring(requestId);
} else {
console.error(`[${this.name}] Unknown capture method: ${this.captureMethod}`);
responseCallback(requestId, `[Error: Unknown capture method '${this.captureMethod}' in provider]`, true);
this.pendingResponseCallbacks.delete(requestId);
}
}
// Handles data received from the debugger (via background script).
// requestId: The unique ID for the chat request.
// rawData: The raw data string from the intercepted network response.
// isFinalFromBackground: Boolean indicating if the background script considers this the final chunk.
handleDebuggerData(requestId, rawData, isFinalFromBackground) {
console.log(`[${this.name}] handleDebuggerData called for requestId: ${requestId}. Raw data length: ${rawData ? rawData.length : 'null'}. isFinalFromBackground: ${isFinalFromBackground}`);
const callback = this.pendingResponseCallbacks.get(requestId);
if (!callback) {
console.warn(`[${this.name}] No pending callback found for debugger data with requestId: ${requestId}. Ignoring.`);
return;
}
let parsedText = "";
let isFinalChunkAccordingToParser = false;
if (rawData && rawData.trim() !== "") {
// **TODO: IMPLEMENT CUSTOM PARSING LOGIC FOR YOUR DEBUGGER DATA**
// This function needs to extract the actual chat message from `rawData`.
// It might involve parsing JSON, Server-Sent Events (SSE), or other formats.
const parseOutput = this.parseDebuggerResponse(rawData);
parsedText = parseOutput.text;
isFinalChunkAccordingToParser = parseOutput.isFinalResponse;
console.log(`[${this.name}] Debugger data parsed for requestId: ${requestId}. Parsed text (first 100 chars): '${(parsedText || "").substring(0,100)}'. Parser says final: ${isFinalChunkAccordingToParser}`);
} else {
console.log(`[${this.name}] Received empty or null rawData from debugger for requestId: ${requestId}. isFinalFromBackground: ${isFinalFromBackground}`);
}
// The overall response is final if the background script says so,
// OR if the parser itself determines this chunk is the end.
const isFinalForCallback = isFinalFromBackground || isFinalChunkAccordingToParser;
console.log(`[${this.name}] Calling callback for requestId ${requestId} with text (first 100): '${(parsedText || "").substring(0,100)}', isFinalForCallback: ${isFinalForCallback}`);
callback(requestId, parsedText, isFinalForCallback);
if (isFinalForCallback) {
console.log(`[${this.name}] Final event processed for requestId: ${requestId}. Removing callback.`);
this.pendingResponseCallbacks.delete(requestId);
}
}
// **TODO: CUSTOMIZE THIS METHOD IF USING DEBUGGER CAPTURE**
// Parses the raw response from the debugger.
// rawDataString: The raw data string (often JSON, but can be anything).
// Returns an object: { text: "extracted message", isFinalResponse: boolean }
parseDebuggerResponse(rawDataString) {
console.log(`[${this.name}] Parsing debugger response. Input (first 200 chars):`, rawDataString ? rawDataString.substring(0,200) : "null");
// --- GENERIC EXAMPLE: ASSUME SIMPLE TEXT OR JSON ---
// This is a placeholder. You MUST adapt this to the actual data format.
let extractedText = "";
let isFinal = false; // Assume not final unless data indicates otherwise
if (!rawDataString || rawDataString.trim() === "") {
return { text: "", isFinalResponse: true }; // Empty response is considered final
}
try {
// Attempt to parse as JSON (common for APIs)
const jsonData = JSON.parse(rawDataString);
// **TODO: Adapt JSON parsing to your specific API response structure**
// Example: data might be in jsonData.choices[0].text or jsonData.message
if (jsonData.message) {
extractedText = jsonData.message;
} else if (jsonData.text) {
extractedText = jsonData.text;
} else if (Array.isArray(jsonData) && jsonData.length > 0 && typeof jsonData[0] === 'string') {
extractedText = jsonData.join("\n"); // If it's an array of strings, join with real newlines
} else {
// Fallback: stringify if structure is unknown but valid JSON
extractedText = JSON.stringify(jsonData);
}
// Example: Check for a done flag
if (typeof jsonData.done === 'boolean') {
isFinal = jsonData.done;
} else {
// If no explicit done flag, assume a single JSON object is a complete, final response.
isFinal = true;
}
} catch (e) {
// If not JSON, treat as plain text.
// This could also be Server-Sent Events (SSE), which need line-by-line parsing.
// Example for SSE:
// if (rawDataString.startsWith("data:")) {
// extractedText = rawDataString.substring(5).trim();
// if (extractedText === "[DONE]") {
// extractedText = ""; // Or some indicator of completion
// isFinal = true;
// }
// } else {
// extractedText = rawDataString;
// }
// For now, just use the raw string as text
extractedText = rawDataString;
isFinal = true; // Assume plain text is a complete response unless part of a stream
}
// If `includeThinkingInMessage` is true, you might structure `extractedText` as a JSON string:
// { "thinking": "...", "answer": "..." }
// For this generic template, we'll assume simple text.
const formattedOutput = this.formatOutput("", extractedText); // No separate thinking text for this basic parser
// Basic guard against returning only empty strings if the marker says final.
if (formattedOutput.trim() === "" && isFinal) {
return { text: "", isFinalResponse: true };
}
return { text: formattedOutput, isFinalResponse: isFinal };
// --- END OF GENERIC EXAMPLE ---
}
// Formats the output string, potentially including thinking text if configured.
formatOutput(thinkingText, answerText) {
if (this.includeThinkingInMessage && thinkingText && thinkingText.trim() !== "") {
try {
const result = {
thinking: thinkingText.trim(),
answer: (answerText || "").trim()
};
return JSON.stringify(result);
} catch (e) {
console.error(`[${this.name}] Error stringifying thinking/answer object:`, e);
return (answerText || "").trim(); // Fallback to just answer
}
}
return (answerText || "").trim(); // Default: just the answer
}
// --- DOM CAPTURE METHODS ---
// Captures the response from the DOM.
// element: Optional. A specific element to check. If null, queries using `responseSelector`.
_captureResponseDOM(element = null) {
// console.log(`[${this.name}] _captureResponseDOM (DOM method) called with element:`, element);
if (!element && this.captureMethod === "dom") {
const elements = document.querySelectorAll(this.responseSelector);
if (elements.length > 0) {
// **TODO: Determine which element is the LATEST response.**
// This usually means the last one in document order.
element = elements[elements.length - 1];
// console.log(`[${this.name}] _captureResponseDOM: Found element via querySelectorAll:`, element);
}
}
if (!element) {
// console.log(`[${this.name}] _captureResponseDOM: No element provided or found by primary selector.`);
// Try fallback selector
const fallbackElements = document.querySelectorAll(this.responseSelectorForDOMFallback);
if (fallbackElements.length > 0) {
element = fallbackElements[fallbackElements.length - 1];
// console.log(`[${this.name}] _captureResponseDOM: Found element via fallback selector:`, element);
}
}
if (!element) {
// console.log(`[${this.name}] _captureResponseDOM: No response element found by any selector.`);
return { found: false, text: '' };
}
// Check if the AI is still "thinking" (e.g., spinner is visible)
if (this._isResponseStillGeneratingDOM()) {
// console.log(`[${this.name}] Response is still being generated (thinking indicator found), waiting.`);
return { found: false, text: '' };
}
let responseText = "";
let foundResponse = false;
try {
// **TODO: CUSTOMIZE TEXT EXTRACTION FROM THE RESPONSE ELEMENT**
// This logic needs to reliably get the text content from `element`.
// It might involve getting `textContent`, `innerText`, or iterating child nodes.
// Consider cases like code blocks, multiple paragraphs, etc.
if (element.textContent) {
let potentialText = element.textContent.trim();
// Basic filter: ignore if it's the user's last sent message or common loading/placeholder text.
// This filtering can be made more robust.
if (potentialText &&
potentialText !== this.lastSentMessage &&
!potentialText.toLowerCase().includes("loading") &&
!potentialText.toLowerCase().includes("generating") &&
!potentialText.toLowerCase().includes("thinking")) {
responseText = potentialText;
foundResponse = true;
// console.log(`[${this.name}] Found response in element:`, responseText.substring(0, 100));
} else {
// console.log(`[${this.name}] Element text is likely noise or self-echo:`, potentialText.substring(0, 100));
}
} else {
// console.log(`[${this.name}] Element has no text content.`);
}
// Add more sophisticated extraction if needed (e.g., combining text from multiple child elements)
} catch (error) {
console.error(`[${this.name}] Error capturing response from DOM element:`, error, "Element:", element);
}
if (foundResponse && responseText) {
// Basic cleanup
responseText = responseText.trim()
.replace(/\n{3,}/g, '\n\n') // Condense runs of three or more newlines
.trim();
}
return {
found: foundResponse && !!responseText.trim(),
text: responseText
};
}
// Checks if the AI is still generating a response (for DOM method).
_isResponseStillGeneratingDOM() {
// **TODO: REFINE THIS LOGIC FOR YOUR TARGET SITE**
// This checks for thinking indicators using `thinkingIndicatorSelector` or `thinkingIndicatorSelectorForDOM`.
let thinkingIndicator = document.querySelector(this.thinkingIndicatorSelector);
if (!thinkingIndicator) {
thinkingIndicator = document.querySelector(this.thinkingIndicatorSelectorForDOM);
}
if (thinkingIndicator) {
// Check if the indicator is visible (important for elements that are hidden/shown)
const style = window.getComputedStyle(thinkingIndicator);
if (style.display !== 'none' && style.visibility !== 'hidden' && parseFloat(style.opacity) > 0) {
// console.log(`[${this.name}] DOM: Thinking indicator found and visible.`);
return true;
}
}
// console.log(`[${this.name}] DOM: No (visible) thinking indicator found.`);
return false;
}
// Starts polling the DOM for responses.
_startDOMMonitoring(requestId) {
console.log(`[${this.name}] DOM: _startDOMMonitoring for requestId: ${requestId}`);
this._stopDOMMonitoring(); // Clear any existing timer
const callback = this.pendingResponseCallbacks.get(requestId);
if (!callback) {
console.error(`[${this.name}] DOM: No callback for requestId ${requestId} in _startDOMMonitoring.`);
return;
}
let attempts = 0;
const maxAttempts = 30; // Try for ~30 seconds (30 * 1000ms)
const interval = 1000; // Poll every 1 second
this.domMonitorTimer = setInterval(() => {
// console.log(`[${this.name}] DOM: Polling attempt ${attempts + 1}/${maxAttempts} for requestId: ${requestId}`);
const responseData = this._captureResponseDOM();
if (responseData.found && responseData.text.trim() !== "") {
console.log(`[${this.name}] DOM: Response captured for requestId ${requestId}. Text (first 100): ${responseData.text.substring(0,100)}`);
this._stopDOMMonitoring();
// For DOM, we typically assume a captured response is final once the thinking indicator is gone.
// More complex sites might require observing mutations to detect streaming.
callback(requestId, responseData.text, true);
this.pendingResponseCallbacks.delete(requestId);
} else {
attempts++;
if (attempts >= maxAttempts) {
console.warn(`[${this.name}] DOM: Max attempts reached for requestId ${requestId}. No complete response captured or thinking indicator persisted.`);
this._stopDOMMonitoring();
// Make one last capture attempt before reporting a timeout.
// Note: _captureResponseDOM still returns nothing while a thinking indicator is visible.
const lastAttemptData = this._captureResponseDOM();
if (lastAttemptData.found && lastAttemptData.text.trim() !== "") {
callback(requestId, lastAttemptData.text, true);
} else {
callback(requestId, "[Error: Timed out waiting for DOM response or response remained empty]", true);
}
this.pendingResponseCallbacks.delete(requestId);
}
}
}, interval);
console.log(`[${this.name}] DOM: Monitoring started with timer ID ${this.domMonitorTimer} for request ${requestId}.`);
}
// Stops the DOM polling timer.
_stopDOMMonitoring() {
if (this.domMonitorTimer) {
// console.log(`[${this.name}] DOM: Stopping DOM monitoring timer ID ${this.domMonitorTimer}`);
clearInterval(this.domMonitorTimer);
this.domMonitorTimer = null;
}
}
// --- UTILITY METHODS ---
// Determines if response monitoring should be skipped.
// (Primarily for background script to decide if it should attach debuggers)
shouldSkipResponseMonitoring() {
// If using debugger, background script handles it. If DOM, content script handles it.
// console.log(`[${this.name}] shouldSkipResponseMonitoring called. Capture method: ${this.captureMethod}`);
return this.captureMethod === "debugger";
}
// Returns URL patterns for the debugger to intercept (if captureMethod is "debugger").
getStreamingApiPatterns() {
console.log(`[${this.name}] getStreamingApiPatterns called. Capture method: ${this.captureMethod}`);
if (this.captureMethod === "debugger" && this.debuggerUrlPattern && this.debuggerUrlPattern !== "*your-api-endpoint-pattern*") {
console.log(`[${this.name}] Using debugger URL pattern: ${this.debuggerUrlPattern}`);
// Background script expects an array of objects with { urlPattern, requestStage }
return [{ urlPattern: this.debuggerUrlPattern, requestStage: "Response" }];
}
console.log(`[${this.name}] No debugger patterns to return (captureMethod is not 'debugger' or pattern is default/empty).`);
return []; // Return empty array if not using debugger or pattern not set
}
}
// Ensure the provider is available on the window for the content script (main.js)
// This registration pattern allows the core extension to find and use this provider.
if (window.providerUtils) {
const providerInstance = new GenericProvider();
window.providerUtils.registerProvider(
providerInstance.name,
providerInstance.supportedDomains,
providerInstance // The instance of this provider class
);
console.log(`[${providerInstance.name}] Provider registered with providerUtils.`);
} else {
console.error("GenericProvider: providerUtils not found on window. Registration failed. Ensure main.js (content script) loads first or provides providerUtils.");
}
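The comments in `parseDebuggerResponse` mention that Server-Sent Events (SSE) need line-by-line parsing. A minimal sketch of what that could look like follows. `parseSseChunk` is a hypothetical helper, not part of the template, and it assumes that each event carries plain text on `data:` lines and that the stream ends with a literal `[DONE]` payload. Real APIs often send JSON payloads or use a different end marker, so adapt it to the target site.

```javascript
// Hypothetical helper: parses one raw SSE chunk into { text, isFinalResponse }.
// Assumes plain-text payloads and a "[DONE]" end marker; real APIs may differ.
function parseSseChunk(rawDataString) {
  const parts = [];
  let isFinal = false;
  for (const line of String(rawDataString || '').split(/\r?\n/)) {
    if (!line.startsWith('data:')) {
      continue; // skip "event:" lines, comments, and blank separators
    }
    let payload = line.slice(5);
    if (payload.startsWith(' ')) {
      payload = payload.slice(1); // drop only the single space after "data:"
    }
    if (payload.trim() === '[DONE]') {
      isFinal = true;
    } else if (payload !== '') {
      parts.push(payload);
    }
  }
  return { text: parts.join(''), isFinalResponse: isFinal };
}
```

The return shape matches what `handleDebuggerData` reads (`text` and `isFinalResponse`), so a provider could call such a helper from the `catch` branch of `parseDebuggerResponse`.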

BIN
images/using-mcp.jpg Normal file

Binary file not shown.

Size: 304 KiB

BIN
images/using-roo-cline.jpg Normal file

Binary file not shown.

Size: 256 KiB

95
mcp-server/README.md Normal file

@ -0,0 +1,95 @@
# Chat Relay MCP
A system that allows AI assistants to interact with web-based chat applications through WebSockets.
## Overview
This project consists of two main components:
1. **MCP Server**: A Node.js server that provides tools and resources for AI assistants to send and receive WebSocket messages
2. **Browser Extension**: A Chrome/Edge extension that intercepts WebSocket communications on web pages and relays them to the MCP server
## Architecture
```
┌─────────────┐         ┌─────────────┐         ┌─────────────┐
│             │         │             │         │             │
│  Web Page   │◄────────│  Browser    │◄────────│  MCP Server │◄────┐
│  WebSocket  │         │  Extension  │         │  WebSocket  │     │
│             │         │             │         │  Bridge     │     │
└─────────────┘         └─────────────┘         └─────────────┘     │
                                                       ▲            │
                                                       │            │
                                                       ▼            │
                                                ┌─────────────┐     │
                                                │             │     │
                                                │  MCP Tools  │     │
                                                │  Resources  │     │
                                                │             │     │
                                                └─────────────┘     │
                                                       ▲            │
                                                       │            │
                                                       ▼            │
                                                ┌─────────────┐     │
                                                │             │     │
                                                │  AI         │─────┘
                                                │  Assistant  │
                                                │             │
                                                └─────────────┘
```
## Setup and Usage
### MCP Server
1. Install dependencies:
```
npm install
```
2. Build the server:
```
npm run build
```
3. Run the server:
```
node dist/index.js
```
The server will start with:
- MCP server using stdio transport for AI assistant communication
- WebSocket bridge server on port 8081 for extension communication
### Browser Extension
1. Navigate to `chrome://extensions` or `edge://extensions`
2. Enable "Developer mode"
3. Click "Load unpacked" and select the `extension` directory
4. The extension is now ready to use
### Using with AI Assistants
AI assistants can use the following MCP tools and resources:
- **send_websocket_message**: Sends a message to the web page's WebSocket
```
tool use chat-relay-mcp.send_websocket_message message="Your message here"
```
- **websocket_messages**: Retrieves messages received from the web page
```
tool use chat-relay-mcp.websocket_messages
```
## Development
- MCP server code is in the `src` directory
- Browser extension code is in the `extension` directory
- The WebSocket bridge runs on port 8081
## Security Considerations
- The browser extension has access to all WebSocket communications on websites you visit
- Only use the extension when needed and disable it when not in use
- The MCP server should only be run on trusted networks
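The last point can be partly enforced in code. As a minimal sketch, assuming the bridge can read each connection's remote address (for example `socket.remoteAddress` in Node.js), a helper like the one below could be used to reject clients that are not on the local machine. `isLoopbackAddress` is a hypothetical name and is not part of this project.

```javascript
// Hypothetical helper: true for loopback addresses in the forms Node.js reports
// (IPv4, IPv6, and IPv4-mapped IPv6 such as "::ffff:127.0.0.1").
function isLoopbackAddress(address) {
  if (typeof address !== 'string') {
    return false;
  }
  const addr = address.startsWith('::ffff:') ? address.slice(7) : address;
  return addr === '::1' || /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(addr);
}
```

A connection handler could close any socket for which this check returns `false`. This narrows exposure but does not replace running the server on a trusted network.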

Binary file not shown.

1743
mcp-server/package-lock.json generated Normal file

File diff suppressed because it is too large Load diff

31
mcp-server/package.json Normal file

@ -0,0 +1,31 @@
{
"name": "chat-relay-mcp",
"version": "0.0.1",
"description": "chat-relay-mcp MCP server",
"type": "module",
"bin": {
"chat-relay-mcp": "./dist/index.js"
},
"files": [
"dist"
],
"scripts": {
"build": "tsc && mcp-build",
"watch": "tsc --watch",
"start": "node dist/index.js"
},
"dependencies": {
"mcp-framework": "^0.2.2",
"node-fetch": "^3.3.2",
"ws": "^8.18.2",
"zod": "^3.22.4"
},
"devDependencies": {
"@types/node": "^20.11.24",
"@types/ws": "^8.18.1",
"typescript": "^5.3.3"
},
"engines": {
"node": ">=18.19.0"
}
}

63
mcp-server/src/index.ts Normal file

@ -0,0 +1,63 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
import { MCPServer } from "mcp-framework";
import SendMessageTool from "./tools/SendMessageTool.js";
import ReadFileTool from "./tools/ReadFileTool.js";
import WriteFileTool from "./tools/WriteFileTool.js";
import EditFileTool from "./tools/EditFileTool.js";
async function startServer() {
// Log to stderr: with the stdio transport, stdout carries the MCP protocol stream.
console.error("Initializing Chat Relay MCP Server...");
// Create instances of tools
const sendMessageTool = new SendMessageTool();
const readFileTool = new ReadFileTool();
const writeFileTool = new WriteFileTool();
const editFileTool = new EditFileTool();
// Create the MCP server with configuration
// Use 'as any' to bypass TypeScript type checking
const mcpServer = new MCPServer({
transport: { type: "stdio" }
} as any);
// Use 'as any' to bypass TypeScript type checking for method calls
try {
(mcpServer as any).registerTool(sendMessageTool);
(mcpServer as any).registerTool(readFileTool);
(mcpServer as any).registerTool(writeFileTool);
(mcpServer as any).registerTool(editFileTool);
console.error("Registered tools using 'registerTool' method"); // stderr keeps stdout clean for the stdio transport
} catch (error) {
console.error("Error registering tools:", error);
console.warn("Falling back to direct property assignment");
(mcpServer as any).tools = [sendMessageTool, readFileTool, writeFileTool, editFileTool];
}
try {
await mcpServer.start();
// Status messages go to stderr so they do not mix with protocol output on stdout.
console.error(`MCP Server started with stdio transport. Registered operations: send_message, read_file, write_file, edit_file`);
console.error(`MCP Server is configured to communicate with API Relay Server on http://localhost:3003`);
} catch (error) {
console.error("Failed to start MCP Server:", error);
process.exit(1);
}
}
startServer().catch(err => {
console.error("Unhandled error during server startup:", err);
process.exit(1);
});


@ -0,0 +1,105 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
import { MCPTool } from "mcp-framework";
import { z } from "zod";
import fs from "fs/promises";
import path from "path";
import fetch from "node-fetch";
interface EditFileInput {
path: string;
oldText: string;
newText: string;
}
class EditFileTool extends MCPTool<EditFileInput> {
name = "edit_file";
description = "Edits a file by replacing text after sending it through the API relay server for processing";
schema = {
path: {
type: z.string(),
description: "Path to the file to edit",
},
oldText: {
type: z.string(),
description: "Text to replace",
},
newText: {
type: z.string(),
description: "New text to insert",
},
};
async execute(input: EditFileInput) {
try {
// Read the file
const filePath = path.resolve(input.path);
const content = await fs.readFile(filePath, "utf-8");
// Check if the oldText exists in the file
if (!content.includes(input.oldText)) {
return `Error: The text to replace was not found in the file ${input.path}`;
}
// Send the edit to the API relay server for processing
const response = await fetch('http://localhost:3003/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: "chatgpt",
messages: [
{
role: "user",
content: `I want to edit a file at path ${input.path} by replacing:\n\n${input.oldText}\n\nWith:\n\n${input.newText}\n\nPlease review this edit and suggest any improvements or corrections.`
}
],
temperature: 0.7,
max_tokens: 100
})
});
if (!response.ok) {
const errorText = await response.text();
console.error(`MCP Tool: Error from API relay server: ${response.status} ${response.statusText}`, errorText);
return `Error from API relay server: ${response.status} ${response.statusText}`;
}
const data = await response.json();
// Log to stderr; with the stdio transport, stdout is reserved for MCP protocol messages.
console.error(`MCP Tool: Received response from API relay server:`, data);
// Extract the assistant's message from the response
const responseData = data as any; // Type assertion
const assistantMessage = responseData.choices[0].message.content;
// Perform the edit (first occurrence only). A replacer function keeps any "$" sequences in newText literal.
const newContent = content.replace(input.oldText, () => input.newText);
await fs.writeFile(filePath, newContent, "utf-8");
// Return success message with the assistant's analysis
return `File successfully edited at ${input.path}\n\nAnalysis: ${assistantMessage}`;
} catch (error: any) {
console.error("MCP Tool: Error editing file or sending to API relay server:", error);
return `Error: ${error.message || 'Unknown error'}`;
}
}
}
export default EditFileTool;


@ -0,0 +1,86 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
import { MCPTool } from "mcp-framework";
import { z } from "zod";
import fs from "fs/promises";
import path from "path";
import fetch from "node-fetch";
interface ReadFileInput {
path: string;
}
class ReadFileTool extends MCPTool<ReadFileInput> {
name = "read_file";
description = "Reads a file and sends its content through the API relay server for processing";
schema = {
path: {
type: z.string(),
description: "Path to the file to read",
},
};
async execute(input: ReadFileInput) {
try {
// Read the file
const filePath = path.resolve(input.path);
const content = await fs.readFile(filePath, "utf-8");
// Send the file content to the API relay server
const response = await fetch('http://localhost:3003/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: "chatgpt",
messages: [
{
role: "user",
content: `The following is the content of the file at path ${input.path}:\n\n${content}\n\nPlease analyze this file content and provide any insights or suggestions.`
}
],
temperature: 0.7,
max_tokens: 100
})
});
if (!response.ok) {
const errorText = await response.text();
console.error(`MCP Tool: Error from API relay server: ${response.status} ${response.statusText}`, errorText);
return `Error from API relay server: ${response.status} ${response.statusText}`;
}
const data = await response.json();
// Log to stderr; with the stdio transport, stdout is reserved for MCP protocol messages.
console.error(`MCP Tool: Received response from API relay server:`, data);
// Extract the assistant's message from the response
const responseData = data as any; // Type assertion
const assistantMessage = responseData.choices[0].message.content;
// Return both the file content and the analysis
return `File content:\n\n${content}\n\nAnalysis: ${assistantMessage}`;
} catch (error: any) {
console.error("MCP Tool: Error reading file or sending to API relay server:", error);
return `Error: ${error.message || 'Unknown error'}`;
}
}
}
export default ReadFileTool;


@ -0,0 +1,78 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
import { MCPTool } from "mcp-framework";
import { z } from "zod";
import fetch from "node-fetch";
interface SendMessageInput {
message: string;
}
class SendMessageTool extends MCPTool<SendMessageInput> {
name = "send_message";
description = "Sends a message through the API relay server to the browser extension";
schema = {
message: {
type: z.string(),
description: "The message to send to the API relay server",
},
};
async execute(input: SendMessageInput) {
try {
// Send a POST request to the API relay server
const response = await fetch('http://localhost:3003/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: "chatgpt",
messages: [
{
role: "user",
content: input.message
}
],
temperature: 0.7,
max_tokens: 100
})
});
if (!response.ok) {
const errorText = await response.text();
console.error(`MCP Tool: Error from API relay server: ${response.status} ${response.statusText}`, errorText);
return `Error from API relay server: ${response.status} ${response.statusText}`;
}
const data = await response.json();
// Log to stderr; with the stdio transport, stdout is reserved for MCP protocol messages.
console.error(`MCP Tool: Received response from API relay server:`, data);
// Extract the assistant's message from the response
const responseData = data as any; // Type assertion
const assistantMessage = responseData.choices[0].message.content;
return `Message sent successfully. Response: "${assistantMessage}"`;
} catch (error: any) {
console.error("MCP Tool: Error sending message to API relay server:", error);
return `Error sending message to API relay server: ${error.message || 'Unknown error'}`;
}
}
}
export default SendMessageTool;


@ -0,0 +1,92 @@
/*
* Chat Relay: Relay for AI Chat Interfaces
* Copyright (C) 2025 Jamison Moore
*
* This program is free software: you can redistribute it and/or modify
* it under the terms of the GNU Affero General Public License as
* published by the Free Software Foundation, either version 3 of the
* License, or (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU Affero General Public License for more details.
*
* You should have received a copy of the GNU Affero General Public License
* along with this program. If not, see https://www.gnu.org/licenses/.
*/
import { MCPTool } from "mcp-framework";
import { z } from "zod";
import fs from "fs/promises";
import path from "path";
import fetch from "node-fetch";
interface WriteFileInput {
path: string;
content: string;
}
class WriteFileTool extends MCPTool<WriteFileInput> {
name = "write_file";
description = "Writes content to a file after sending it through the API relay server for processing";
schema = {
path: {
type: z.string(),
description: "Path to the file to write",
},
content: {
type: z.string(),
description: "Content to write to the file",
},
};
async execute(input: WriteFileInput) {
try {
// Send the file content to the API relay server for processing
const response = await fetch('http://localhost:3003/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: "chatgpt",
messages: [
{
role: "user",
content: `I want to write the following content to a file at path ${input.path}:\n\n${input.content}\n\nPlease review this content and suggest any improvements or corrections before I write it to the file.`
}
],
temperature: 0.7,
max_tokens: 100
})
});
if (!response.ok) {
const errorText = await response.text();
console.error(`MCP Tool: Error from API relay server: ${response.status} ${response.statusText}`, errorText);
return `Error from API relay server: ${response.status} ${response.statusText}`;
}
const data = await response.json();
// Log to stderr; with the stdio transport, stdout is reserved for MCP protocol messages.
console.error(`MCP Tool: Received response from API relay server:`, data);
// Extract the assistant's message from the response
const responseData = data as any; // Type assertion
const assistantMessage = responseData.choices[0].message.content;
// Write the file
const filePath = path.resolve(input.path);
await fs.mkdir(path.dirname(filePath), { recursive: true });
await fs.writeFile(filePath, input.content, "utf-8");
// Return success message with the assistant's analysis
return `File successfully written to ${input.path}\n\nAnalysis: ${assistantMessage}`;
} catch (error: any) {
console.error("MCP Tool: Error writing file or sending to API relay server:", error);
return `Error: ${error.message || 'Unknown error'}`;
}
}
}
export default WriteFileTool;

19
mcp-server/tsconfig.json Normal file

@ -0,0 +1,19 @@
{
"compilerOptions": {
"target": "ESNext",
"module": "ESNext",
"moduleResolution": "node",
"outDir": "dist",
"rootDir": "src",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true
},
"include": [
"src/**/*"
],
"exclude": [
"node_modules"
]
}


@ -0,0 +1,29 @@
# Active Context
This file tracks the current session state and goals.
## Current Tasks and Objectives
- Update memory-bank documents based on recent feature implementation and documentation updates:
- [`memory-bank/activeContext.md`](memory-bank/activeContext.md) (this file)
- [`memory-bank/productContext.md`](memory-bank/productContext.md)
- [`memory-bank/progress.md`](memory-bank/progress.md)
- [`memory-bank/decisionLog.md`](memory-bank/decisionLog.md)
## Recent Changes and Decisions
- **Completed**: Implemented a message queuing/dropping system for the `api-relay-server`.
- Modified [`api-relay-server/src/server.ts`](api-relay-server/src/server.ts) with core logic, state variables (`activeExtensionProcessingId`, `newRequestBehavior`, `requestQueue`), `QueuedRequest` interface, `processRequest()` and `finishProcessingRequest()` functions.
- Made `newRequestBehavior` ('queue' or 'drop') configurable via `server-config.json` and Admin UI.
- Updated `ServerConfig` interface, `loadServerConfig()`, `/v1/admin/server-info`, and `/v1/admin/update-settings` in `server.ts`.
- Extended `AdminLogEntry['type']` for new log types.
- **Completed**: Updated frontend [`api-relay-server/src/admin-ui/admin.html`](api-relay-server/src/admin-ui/admin.html) to include UI elements for `newRequestBehavior` and updated relevant JavaScript functions (`fetchAndDisplayServerInfo`, `handleSaveSettings`).
- **Completed**: Updated documentation:
- [`docs/server-architecture.md`](docs/server-architecture.md) to reflect the new queuing system, configuration, and updated diagrams.
- [`docs/user-manual.md`](docs/user-manual.md) to include details on configuring `newRequestBehavior` via the Admin UI and `server-config.json`.
- **Decision**: Proceed with updating the four specified memory-bank documents based on user feedback.
- **Action**: Read the content of [`memory-bank/activeContext.md`](memory-bank/activeContext.md) to prepare for its update.
## Open Questions and Blockers
- None at this time.
## Session-Specific Context
- The session focused on a significant feature enhancement (queuing/dropping system) for the `api-relay-server`, followed by updates to user-facing and architectural documentation.
- Now transitioning to update internal project memory/knowledge base files.

145
memory-bank/decisionLog.md Normal file

@ -0,0 +1,145 @@
# Decision Log
This file records significant architectural decisions, their rationale, and implications for the Chat Relay project.
---
**`[YYYY-MM-DD]` - Initial System Architecture: Three-Component Design**
- **Decision**: The system will be composed of three main components:
1. An OpenAI-Compatible API Server ([`api-relay-server/`](api-relay-server/)).
2. A Browser Extension ([`extension/`](extension/)).
3. An optional MCP (Model Context Protocol) Server ([`mcp-server/`](mcp-server/)).
- **Rationale**:
* Decouples the client application (e.g., Cline/RooCode) from the complexities of web browser automation.
* Provides a standardized API interface for clients.
* Allows the browser extension to focus solely on interacting with specific chat UIs.
* The MCP server offers extensibility for developer tools without impacting core relay functionality.
- **Alternatives Considered**:
* *Monolithic Application*: Combining server and extension logic would reduce flexibility and make supporting multiple chat UIs more complex.
* *Direct Client-to-Browser Automation*: Could be less secure and require significant client-side complexity for each supported browser/UI.
- **Implications**: Requires managing inter-component communication (HTTP for client-server, WebSockets for server-extension).
---
**`[YYYY-MM-DD]` - API Standard: OpenAI Compatibility**
- **Decision**: The API Relay Server will expose an OpenAI-compatible endpoint (specifically `/v1/chat/completions`).
- **Rationale**:
* Ensures seamless integration with existing AI development tools like Cline/RooCode that already support the OpenAI API format.
* Lowers the barrier to adoption for users familiar with this standard.
* Standardizes the data format for requests and responses.
- **Alternatives Considered**:
* *Custom API*: Would require bespoke client integrations for each application using the relay, increasing development overhead.
- **Implications**: The server must accurately mimic the expected request/response structure of the OpenAI API.
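
As a rough illustration of what mimicking the response structure involves, the sketch below wraps text captured from a chat UI in a `chat.completion`-style object. The helper name `toChatCompletion`, the ID format, and the zeroed `usage` counts are assumptions made for this example; they are not taken from `server.ts`.

```typescript
// Hypothetical helper: wraps captured chat text in an OpenAI-style
// chat completion object. Names and ID format are illustrative only.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatCompletionResponse {
  id: string;
  object: "chat.completion";
  created: number; // Unix time in seconds
  model: string;
  choices: { index: number; message: ChatMessage; finish_reason: "stop" }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

function toChatCompletion(requestId: string, model: string, text: string): ChatCompletionResponse {
  return {
    id: `chatcmpl-${requestId}`,
    object: "chat.completion",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [
      { index: 0, message: { role: "assistant", content: text }, finish_reason: "stop" },
    ],
    // Token counts are not known when relaying a web UI, so this sketch reports zeros.
    usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
  };
}
```

Clients such as the MCP tools in this repository read the reply from `choices[0].message.content`, so that field is the one the relay most needs to populate correctly.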
---
**`[YYYY-MM-DD]` - Server-Extension Communication: WebSockets**
- **Decision**: Communication between the API Relay Server and the Browser Extension will be handled via WebSockets.
- **Rationale**:
* Provides persistent, bidirectional, and real-time communication, essential for promptly relaying chat messages and responses.
* More efficient than HTTP polling for this use case.
- **Alternatives Considered**:
* *HTTP Long Polling/Polling*: Would introduce higher latency and be less efficient for frequent message exchange.
* *Server-Sent Events (SSE)*: Suitable for server-to-client streaming, but WebSockets offer better bidirectional capabilities needed here.
- **Implications**: Requires careful management of WebSocket connection states, heartbeats (ping/pong), and potential reconnections on both server and extension sides.
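
One common way to handle the ping/pong bookkeeping mentioned above is sketched below. It is written against a minimal `HeartbeatSocket` interface rather than the `ws` types so that it stays self-contained; `HeartbeatMonitor` and its method names are hypothetical and do not come from the project's code.

```typescript
// Minimal connection shape for this sketch. The `ws` WebSocket class
// provides ping() and terminate() methods with these names.
interface HeartbeatSocket {
  ping(): void;
  terminate(): void;
}

// Hypothetical helper, not taken from server.ts.
class HeartbeatMonitor {
  private alive = new Map<HeartbeatSocket, boolean>();

  // Start tracking a newly connected socket.
  track(socket: HeartbeatSocket): void {
    this.alive.set(socket, true);
  }

  // Call from the socket's "pong" handler.
  markPong(socket: HeartbeatSocket): void {
    if (this.alive.has(socket)) this.alive.set(socket, true);
  }

  // Call on a fixed interval: terminate sockets that never answered the
  // previous ping, then ping the remaining ones.
  sweep(): void {
    this.alive.forEach((wasAlive, socket) => {
      if (!wasAlive) {
        this.alive.delete(socket);
        socket.terminate();
        return;
      }
      this.alive.set(socket, false);
      socket.ping();
    });
  }
}
```

In a server this would typically be driven by `setInterval(() => monitor.sweep(), intervalMs)`, with the interval chosen to suit the deployment.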
---
**`[YYYY-MM-DD]` - Browser Extension Technology Stack**
- **Decision**:
* Adhere to Chrome Extension Manifest V3 standards.
* Utilize a Service Worker ([`extension/background.js`](extension/background.js)) for background processing and WebSocket management.
* Employ Content Scripts (e.g., [`extension/content.js`](extension/content.js)) for DOM manipulation and interaction with chat interface pages.
- **Rationale**:
* Manifest V3 is the current standard for Chrome extensions, offering improved security and performance.
* Service workers are the standard for background tasks in Manifest V3.
* Content scripts are necessary for direct interaction with web page content.
- **Implications**: Development must follow Manifest V3 guidelines and lifecycle.
---
**`[YYYY-MM-DD]` - Extension Modularity: Provider-Based Architecture**
- **Decision**: The browser extension will use a modular "provider" architecture to handle interactions with different chat UIs. Each supported chat interface (e.g., Gemini, ChatGPT, AI Studio) will have its own provider script (e.g., [`extension/providers/chatgpt.js`](extension/providers/chatgpt.js)).
- **Rationale**:
* Simplifies adding support for new chat interfaces by encapsulating UI-specific logic within individual provider modules.
* Improves code organization and maintainability within the extension.
* Allows for shared utilities via scripts like [`extension/providers/provider-utils.js`](extension/providers/provider-utils.js).
- **Alternatives Considered**:
* *Single Large Content Script*: Would become unwieldy and difficult to manage as support for more UIs is added. Conditional logic for different UIs would make the code complex.
- **Implications**: Requires a clear interface or convention for providers to adhere to.
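
A provider convention of this kind could look roughly like the sketch below. The real providers are plain JavaScript files, and the `ChatProvider` fields, the `ProviderRegistry` class, and hostname-based lookup are all assumptions made for illustration.

```typescript
// Hypothetical provider contract; the actual provider scripts may expose
// different members.
interface ChatProvider {
  name: string;
  hostnames: string[]; // sites this provider handles (assumed matching rule)
  sendMessage(text: string): Promise<void>;
}

// Hypothetical registry that picks a provider for the current page.
class ProviderRegistry {
  private providers: ChatProvider[] = [];

  register(provider: ChatProvider): void {
    this.providers.push(provider);
  }

  // Returns the first provider registered for the hostname, if any.
  forHostname(hostname: string): ChatProvider | undefined {
    return this.providers.find((p) => p.hostnames.some((h) => h === hostname));
  }
}
```

With a shape like this, supporting a new chat UI means writing one object that satisfies `ChatProvider` and registering it, without touching the lookup logic.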
---
**`[YYYY-MM-DD]` - Chat Response Capture Mechanism**
- **Decision**: Primarily rely on DOM manipulation and potentially debugger APIs (as suggested in [`README.md`](README.md)) for injecting messages and capturing responses from chat web UIs.
- **Rationale**:
* Direct interaction with the DOM is often the only way to automate web UIs that do not provide external APIs for message exchange.
* Debugger APIs (if used) can offer more robust ways to intercept data or events.
- **Alternatives Considered**:
* *Optical Character Recognition (OCR)*: Too complex, slow, and error-prone for real-time chat.
* *Network Request Sniffing*: May be blocked by HTTPS, difficult to parse consistently, or violate terms of service.
- **Implications**: Capture logic is highly dependent on the specific DOM structure of each chat UI and can be brittle if UIs change frequently. Requires careful selector management and robust error handling.
---
**`[YYYY-MM-DD]` - MCP Server Role: Optional Developer Utility**
- **Decision**: The MCP Server ([`mcp-server/`](mcp-server/)) will be an optional component, primarily serving as a developer utility.
- **Rationale**:
* The core functionality of relaying chat messages does not depend on the MCP server.
* Keeps the primary system simpler for end-users who may not need developer tools.
* Provides valuable tools for testing, simulation, and debugging during development or for advanced users.
- **Implications**: Documentation should clearly state its optional nature. The main system should function correctly without it.
---
**`2025-05-09` - Single Active Request Processing by Extension**
- **Decision**: The API Relay Server will enforce that only one message/request is actively being processed by the connected browser extension at any given time.
- **Rationale**: To prevent race conditions within the browser extension, ensure reliable association of responses to requests, and avoid overloading the extension or the target chat UI. This simplifies state management in the extension.
- **Implementation Details**: A state variable `activeExtensionProcessingId` in [`api-relay-server/src/server.ts`](api-relay-server/src/server.ts) tracks the `requestId` of the job currently with the extension.
- **Implications**: Necessitates a strategy for handling concurrent incoming requests when the extension is busy.
---
**`2025-05-09` - Configurable Behavior for Busy Extension (Queue/Drop)**
- **Decision**: When the browser extension is busy, the API Relay Server's behavior for new incoming requests will be configurable: either 'queue' or 'drop'.
- **Rationale**:
* 'Queue': Ensures all requests are eventually processed, suitable for non-interactive or batch tasks.
* 'Drop': Provides immediate feedback (an HTTP 429 Too Many Requests response) for interactive tasks, preventing long client wait times and large queue build-ups.
* Making the behavior configurable lets users pick the trade-off that fits their workload.
- **Implementation Details**:
* A global variable `newRequestBehavior: 'queue' | 'drop'` in [`api-relay-server/src/server.ts`](api-relay-server/src/server.ts).
* An in-memory `requestQueue: QueuedRequest[]` to hold requests when behavior is 'queue'.
* Logic within the `/v1/chat/completions` endpoint to check `activeExtensionProcessingId` and `newRequestBehavior`.
- **Alternatives Considered**:
* *Always Queue*: Simpler, but could lead to very long wait times for clients if the queue grows large.
* *Always Drop*: Simpler, but might lose important non-interactive requests.
- **Implications**: The chosen behavior directly impacts client experience and system throughput under load.
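
The branching described in this entry can be sketched as a small state machine. The variable names `activeExtensionProcessingId`, `newRequestBehavior`, and `requestQueue` follow the log; the `RequestGate` class, its methods, and the string-typed request ID are assumptions, and the real `QueuedRequest` also carries the HTTP response object.

```typescript
type NewRequestBehavior = "queue" | "drop";

// Simplified stand-in for the QueuedRequest interface in server.ts.
interface QueuedRequest {
  requestId: string;
  message: string;
}

// Hypothetical wrapper around the state described in the decision log.
class RequestGate {
  private activeExtensionProcessingId: string | null = null;
  private requestQueue: QueuedRequest[] = [];

  constructor(public newRequestBehavior: NewRequestBehavior) {}

  // Decides what happens to an incoming request.
  submit(req: QueuedRequest): "process" | "queued" | "dropped" {
    if (this.activeExtensionProcessingId === null) {
      this.activeExtensionProcessingId = req.requestId;
      return "process";
    }
    if (this.newRequestBehavior === "drop") {
      return "dropped"; // the caller would answer with HTTP 429
    }
    this.requestQueue.push(req);
    return "queued";
  }

  // Marks the active request as done and hands back the next queued one.
  finish(): QueuedRequest | undefined {
    this.activeExtensionProcessingId = null;
    const next = this.requestQueue.shift();
    if (next) this.activeExtensionProcessingId = next.requestId;
    return next;
  }
}
```

Keeping the decision in one place like this makes it straightforward to change `newRequestBehavior` at runtime, since only `submit` consults it.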
---
**`2025-05-09` - Configuration Mechanism for Request Handling Behavior**
- **Decision**: The `newRequestBehavior` setting, along with existing settings like `port` and `requestTimeoutMs`, will be configurable via:
1. A `server-config.json` file (stored in `api-relay-server/dist/`).
2. An Admin Web UI served by the API Relay Server.
- **Rationale**:
* `server-config.json` allows for persistent configuration across server restarts.
* The Admin UI provides a user-friendly interface for viewing and modifying these settings without direct file manipulation, making it more accessible.
* Changes to `newRequestBehavior` and `requestTimeoutMs` via the Admin UI are effective immediately. Port changes require a restart.
- **Implementation Details**:
* Updates to `loadServerConfig()` and `saveServerConfig()` in [`api-relay-server/src/server.ts`](api-relay-server/src/server.ts).
* Enhancements to `/v1/admin/server-info` and `/v1/admin/update-settings` API endpoints.
* UI elements (radio buttons for `newRequestBehavior`) and corresponding JavaScript logic in [`api-relay-server/src/admin-ui/admin.html`](api-relay-server/src/admin-ui/admin.html).
- **Implications**: Adds complexity to the server for handling configuration loading, saving, and UI interaction, but significantly improves usability and control.
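
A minimal sketch of defensive config loading is shown below, assuming the file holds a flat JSON object with the three settings named above. The function name `parseServerConfig` and the default timeout and behavior values are assumptions; only port 3003 appears elsewhere in the project (as the URL used by the MCP tools).

```typescript
type NewRequestBehavior = "queue" | "drop";

interface ServerConfig {
  port: number;
  requestTimeoutMs: number;
  newRequestBehavior: NewRequestBehavior;
}

// Illustrative defaults; the timeout and behavior values are assumptions.
const DEFAULT_CONFIG: ServerConfig = {
  port: 3003,
  requestTimeoutMs: 120000,
  newRequestBehavior: "queue",
};

// Hypothetical parser: overlays the text of server-config.json on the
// defaults, ignoring fields that are missing or have the wrong type.
function parseServerConfig(jsonText: string): ServerConfig {
  let raw: Record<string, unknown> = {};
  try {
    const parsed: unknown = JSON.parse(jsonText);
    if (typeof parsed === "object" && parsed !== null) {
      raw = parsed as Record<string, unknown>;
    }
  } catch {
    // Malformed JSON: keep the defaults.
  }
  const { port, requestTimeoutMs, newRequestBehavior } = raw;
  return {
    port: typeof port === "number" ? port : DEFAULT_CONFIG.port,
    requestTimeoutMs:
      typeof requestTimeoutMs === "number" ? requestTimeoutMs : DEFAULT_CONFIG.requestTimeoutMs,
    newRequestBehavior:
      newRequestBehavior === "queue" || newRequestBehavior === "drop"
        ? newRequestBehavior
        : DEFAULT_CONFIG.newRequestBehavior,
  };
}
```

Validating each field separately means a hand-edited file with one bad value still yields a usable configuration.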
---
**`2025-05-09` - In-Memory Request Queue**
- **Decision**: The request queue for the 'queue' behavior will be implemented as an in-memory array within the API Relay Server.
- **Rationale**:
* Sufficient for the current expected load and use case of the system.
* Simplifies the implementation by avoiding external dependencies (e.g., Redis, RabbitMQ) for queuing, which would be overkill.
- **Alternatives Considered**:
* *Persistent Queue (e.g., Redis-backed)*: Would provide durability across server restarts but adds operational complexity and dependencies not justified at this stage.
- **Implications**: Queued requests are volatile and will be lost if the API Relay Server restarts. The queue size is limited by available server memory.
---
**`2025-05-09` - Deferred HTTP Response for Queued Requests**
- **Decision**: When a request is added to the queue, its corresponding HTTP response to the client (e.g., Cline/RooCode) will be deferred. The response will only be sent once the request is dequeued, processed by the browser extension, and a result (or error/timeout) is obtained.
- **Rationale**: This approach maintains the synchronous-like interaction model expected by clients using an OpenAI-compatible API. The client sends one request and waits for one eventual response, abstracting the queuing mechanism.
- **Implementation Details**: The Express `res` (response) object is stored as part of the `QueuedRequest` object in the `requestQueue`. When `processRequest` handles a dequeued item, it uses this stored `res` object to send the final HTTP response.
- **Implications**: Client connections will be held open longer for queued requests. Requires careful management of the `res` object to ensure it's not prematurely closed or written to.
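
One way to guard against writing to a stored response twice (for example, when a timeout and a late extension reply race each other) is sketched below. `ResponseLike` is a stand-in for the few Express response members the sketch uses, and `sendOnce` is a hypothetical helper rather than a function from `server.ts`.

```typescript
// Stand-in for the parts of an Express response used here. Express
// responses expose `headersSent`, `status()`, and `json()` with these names.
interface ResponseLike {
  headersSent: boolean;
  status(code: number): ResponseLike;
  json(body: unknown): void;
}

// Hypothetical guard: sends the reply only if nothing has been sent yet.
// Returns true when this call actually wrote the response.
function sendOnce(res: ResponseLike, code: number, body: unknown): boolean {
  if (res.headersSent) {
    return false;
  }
  res.status(code).json(body);
  return true;
}
```

A timeout handler and the normal completion path can then both call `sendOnce` on the stored `res`; whichever runs second becomes a no-op.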
---
**`2025-05-10` - Claude Provider Debugger URL Pattern Confirmation**
- **Decision**: Confirmed that `debuggerUrlPattern: "*/completion*"` in `extension/providers/claude.js` is the correct pattern for intercepting Claude's chat SSE stream.
- **Rationale**: Initial assumptions that Claude might use a different endpoint (e.g., `*/api/append_message*`) for its chat stream were incorrect. Browser network inspection during testing clearly showed the SSE stream originating from a URL matching `*/completion*`.
- **Implications**: Debugger attachment relies on this pattern. Future changes by Claude to this endpoint URL would require updating this pattern.
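
To make the matching behavior concrete, the sketch below converts a `*`-wildcard pattern into a regular expression, assuming `*` stands for any run of characters. The helper `globToRegExp` and the example URLs are hypothetical; the extension's actual matching code may work differently, and this sketch treats every character other than `*` literally.

```typescript
// Hypothetical helper: turns a "*"-wildcard pattern into an anchored RegExp.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*"); // "*" matches any run of characters
  return new RegExp(`^${escaped}$`);
}

const completionPattern = globToRegExp("*/completion*");
const matchesStream = completionPattern.test("https://example.com/api/chat/completion?stream=1");
const matchesOther = completionPattern.test("https://example.com/api/append_message");
```

Here `matchesStream` is `true` and `matchesOther` is `false`, which mirrors why the `*/api/append_message*` assumption failed to attach to the stream.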
---
**`2025-05-10` - Claude Provider SSE Stream Parsing for End-of-Message**
- **Decision**: Refined the `parseDebuggerResponse` method in `extension/providers/claude.js` to robustly detect end-of-message signals from Claude's SSE stream.
- **Rationale**: The provider was not consistently sending the complete message back to the application. The fix involved:
- Ensuring direct detection of `event: message_stop`.
- Correctly parsing `event: message_delta` for a `stop_reason` in its JSON data.
- Accumulating text from `content_block_delta` events until one of these end-of-message signals is received.
- **Implications**: Improves reliability of message completion for the Claude provider. The parsing logic is specific to Claude's current SSE structure.
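
The accumulation logic described above can be sketched as follows. The event names come from this entry; the exact JSON layout (`delta.text` for text chunks, `delta.stop_reason` or a top-level `stop_reason` for the stop signal) and the function name `parseSseText` are assumptions, so the real `parseDebuggerResponse` may differ.

```typescript
interface ParsedStream {
  text: string; // accumulated assistant text
  isFinal: boolean; // true once an end-of-message signal was seen
}

// Hypothetical parser for a block of SSE text made of "event:" and "data:" lines.
function parseSseText(raw: string): ParsedStream {
  let text = "";
  let isFinal = false;
  let currentEvent = "";
  for (const line of raw.split(/\r?\n/)) {
    if (line.startsWith("event:")) {
      currentEvent = line.slice("event:".length).trim();
      if (currentEvent === "message_stop") isFinal = true;
    } else if (line.startsWith("data:")) {
      let data: any;
      try {
        data = JSON.parse(line.slice("data:".length).trim());
      } catch {
        continue; // skip payloads that are not valid JSON
      }
      if (currentEvent === "content_block_delta" && typeof data?.delta?.text === "string") {
        text += data.delta.text;
      }
      if (currentEvent === "message_delta" && (data?.delta?.stop_reason ?? data?.stop_reason)) {
        isFinal = true;
      }
    }
  }
  return { text, isFinal };
}
```

A caller would keep feeding stream data to a parser like this and forward the text to the relay server only once `isFinal` is `true`.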
---
**`2025-05-10` - Disabling Function Calling Logic for Claude Provider**
- **Decision**: Commented out the `ensureFunctionCallingEnabled` method and its invocations within `extension/providers/claude.js`.
- **Rationale**: The Claude web interface (`claude.ai`) does not currently present a user-toggleable option for "Function calling" similar to what might be found in other AI platforms (like AI Studio, for which this feature was originally designed). Attempting to find and click a non-existent toggle was unnecessary and cluttered console logs.
- **Implications**: If Claude introduces such a UI feature in the future, this code would need to be uncommented and potentially adapted. For now, it simplifies the provider's initialization.
---
*(Note: Replace `[YYYY-MM-DD]` with actual decision dates or approximate dates when these architectural choices were likely made based on project evolution.)*


@ -0,0 +1,76 @@
# Product Context
This file defines the project scope, core knowledge, component architecture, technical standards, and key dependencies for the Chat Relay system.
## Project Overview and Goals
**Chat Relay** is a system designed to enable Cline/RooCode (or other AI development applications) to communicate with various web-based AI chat interfaces (such as Gemini, AI Studio, ChatGPT, and Claude) via an OpenAI-compatible API.
**Primary Goals**:
- Provide an OpenAI-compatible API endpoint for seamless integration with tools like Cline/RooCode.
- Relay messages to and from web-based chat UIs that may not have public APIs or offer different capabilities through their web interfaces.
- Enable interaction with multiple AI chat providers through a single, consistent API.
- Offer a modular architecture to easily support new chat providers in the future.
- Ensure robust handling of concurrent requests to prevent overloading browser extensions and maintain response integrity.
## Component Architecture
The system comprises three main components:
1. **API Relay Server** ([`api-relay-server/`](api-relay-server/)):
* **Purpose**: Implements an OpenAI-compatible API endpoint (`/v1/chat/completions`) that client applications (e.g., Cline/RooCode) connect to. It manages WebSocket connections with the Browser Extension for real-time message relay. It now also includes a configurable message queuing/dropping system (`newRequestBehavior`) to manage request flow to the browser extension, ensuring only one message is actively processed by the extension at a time.
* **Key Files**: [`api-relay-server/src/server.ts`](api-relay-server/src/server.ts) (main server logic, including Express app, WebSocket server, queuing system, and admin UI).
2. **Browser Extension** ([`extension/`](extension/)):
* **Purpose**: Runs in the user's browser (Chrome). It connects to the API Relay Server via WebSocket. It injects user messages into the target chat interface's DOM, simulates sending, and captures the AI's response from the UI.
* **Key Files**:
* [`extension/manifest.json`](extension/manifest.json): Defines extension properties, permissions, and scripts.
* [`extension/background.js`](extension/background.js): Service worker managing WebSocket connection and core relay logic.
* [`extension/content.js`](extension/content.js): Injected into chat interface pages to interact with the DOM.
* Provider-specific scripts (e.g., [`extension/providers/chatgpt.js`](extension/providers/chatgpt.js), [`extension/providers/aistudio.js`](extension/providers/aistudio.js), [`extension/providers/gemini.js`](extension/providers/gemini.js), [`extension/providers/claude.js`](extension/providers/claude.js)): Handle interactions for each supported chat UI.
* [`extension/providers/provider-utils.js`](extension/providers/provider-utils.js): Common utility functions for providers.
3. **MCP Server** ([`mcp-server/`](mcp-server/)):
* **Purpose**: An optional Model Context Protocol server that can provide additional developer utilities, such as tools for simulating messages, testing extension interactions, or viewing communication traffic.
* **Key Files**: [`mcp-server/src/index.ts`](mcp-server/src/index.ts) (main server logic).
**Data Flow**:
1. Cline/RooCode sends an HTTP POST request (OpenAI format) to the API Relay Server (`/v1/chat/completions`).
2. The API Relay Server checks if a browser extension is currently processing a message (`activeExtensionProcessingId`).
* If the extension is busy:
* If `newRequestBehavior` is 'drop', the server responds immediately with an HTTP 429 (Too Many Requests) error.
* If `newRequestBehavior` is 'queue', the request (including the HTTP `res` object) is added to an in-memory `requestQueue`, and the HTTP response is deferred.
* If the extension is free, the server proceeds to step 3 directly with the current request.
3. The API Relay Server (via `processRequest` function) sends the message content to the connected Browser Extension via WebSocket.
4. The Browser Extension (using its content scripts and providers) injects the message into the active chat interface (e.g., Gemini) and simulates sending.
5. The Extension captures the response generated by the chat interface from the UI.
6. The Extension sends the captured response back to the API Relay Server via WebSocket.
7. The API Relay Server (within `processRequest`) receives the response, formats it into an OpenAI-compatible JSON structure, and returns it to Cline/RooCode (using the original or stored `res` object).
8. `finishProcessingRequest` is called, clearing the `activeExtensionProcessingId`. If requests are queued and `newRequestBehavior` is 'queue', the next request is dequeued and processed starting from step 3.
## Technical Standards
- **API Relay Server & MCP Server**:
* Language: TypeScript. Indicated by [`api-relay-server/tsconfig.json`](api-relay-server/tsconfig.json), [`mcp-server/tsconfig.json`](mcp-server/tsconfig.json) and `.ts` files.
* Web Server Framework (API Relay): Express.js (from dependencies).
* WebSocket Communication: `ws` library.
* MCP Framework (MCP Server): `mcp-framework` library.
- **Browser Extension**:
* Chrome Extension Manifest V3.
* Core Logic: Vanilla JavaScript.
* Communication: Native Browser WebSocket API to connect to the API Relay Server.
* DOM Manipulation for UI interaction.
- **General**:
    * Adherence to OpenAI API structure for `/v1/chat/completions` endpoint.
    * Use of `npm` for package management.
    * Configuration for API Relay Server via `server-config.json` and Admin UI.
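
As an illustration of what adhering to the OpenAI structure means for the relay's reply, the sketch below wraps captured chat text in a chat completion response object. The field layout follows the public OpenAI chat completion response; `toChatCompletion` and its parameters are hypothetical names, and the zeroed token counts are an assumption made here because text scraped from a chat UI carries no usage data.

```typescript
// Sketch only: toChatCompletion is a hypothetical helper, not the server's code.

interface ChatCompletionResponse {
  id: string;
  object: "chat.completion";
  created: number; // Unix timestamp in seconds
  model: string;
  choices: Array<{
    index: number;
    message: { role: "assistant"; content: string };
    finish_reason: "stop";
  }>;
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

function toChatCompletion(
  content: string,
  model: string,
  requestId: string,
): ChatCompletionResponse {
  return {
    id: `chatcmpl-${requestId}`,
    object: "chat.completion",
    created: Math.floor(Date.now() / 1000),
    model,
    choices: [
      { index: 0, message: { role: "assistant", content }, finish_reason: "stop" },
    ],
    // Assumption: token usage is unknown for UI-captured text, so zeros are reported.
    usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 },
  };
}
```

A client such as Cline/RooCode would then read the reply text from `choices[0].message.content`, as it would with a response from the OpenAI API itself.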
## Key Dependencies
- **API Relay Server** ([`api-relay-server/package.json`](api-relay-server/package.json)):
    * `express`: Web framework.
    * `ws`: WebSocket library.
    * `cors`: Cross-Origin Resource Sharing middleware.
    * `body-parser`: Request body parsing middleware.
    * `typescript`, `nodemon` (dev).
- **MCP Server** ([`mcp-server/package.json`](mcp-server/package.json)):
    * `mcp-framework`: Core framework for MCP server development.
    * `ws`: WebSocket library.
    * `node-fetch`: For making HTTP requests.
    * `zod`: Schema validation.
    * `typescript` (dev).
- **Browser Extension** ([`extension/manifest.json`](extension/manifest.json)):
    * Relies on native browser APIs (DOM, WebSocket, Storage, Alarms, Scripting, Debugger, Tabs).
    * No explicit third-party JavaScript libraries are listed in the manifest; the extension is likely self-contained, using utility scripts such as [`extension/providers/provider-utils.js`](extension/providers/provider-utils.js).
This document should be updated if the project's scope, architecture, or core technologies change significantly.

59
memory-bank/progress.md Normal file
View file

@ -0,0 +1,59 @@
# Progress Log
This file tracks tasks, their status, and any significant changes.
## Completed Work Items
- `[Initial Setup]` - Initial Memory Bank setup (as per previous log).
- `[YYYY-MM-DD HH:MM:SS]` - Updated [`memory-bank/activeContext.md`](memory-bank/activeContext.md) to reflect current session goals of updating memory bank documents.
- `[YYYY-MM-DD HH:MM:SS]` - Updated [`memory-bank/productContext.md`](memory-bank/productContext.md) with project overview, component architecture, technical standards, and key dependencies based on project files like [`README.md`](README.md), [`api-relay-server/package.json`](api-relay-server/package.json), [`mcp-server/package.json`](mcp-server/package.json), and [`extension/manifest.json`](extension/manifest.json).
- `[YYYY-MM-DD HH:MM:SS]` - Updated [`memory-bank/decisionLog.md`](memory-bank/decisionLog.md) with inferred architectural and technical decisions.
- `[YYYY-MM-DD HH:MM:SS]` - Initial update of the four core memory-bank documents completed and reviewed.
- `[YYYY-MM-DD HH:MM:SS]` - Created directory [`api-relay-server/src/admin-ui/`](api-relay-server/src/admin-ui/) using the `filesystem` MCP tool.
- `[YYYY-MM-DD HH:MM:SS]` - Created initial HTML structure for the admin dashboard at [`api-relay-server/src/admin-ui/admin.html`](api-relay-server/src/admin-ui/admin.html).
- `[YYYY-MM-DD HH:MM:SS]` - Modified [`api-relay-server/src/server.ts`](api-relay-server/src/server.ts) to serve static files for the admin UI and added an `/admin` route for [`admin.html`](api-relay-server/src/admin-ui/admin.html).
- **Feature Implementation: Message Queuing/Dropping System (Completed `2025-05-09`)**
    - Implemented core queuing/dropping logic in [`api-relay-server/src/server.ts`](api-relay-server/src/server.ts).
    - Added state variables (`activeExtensionProcessingId`, `newRequestBehavior`, `requestQueue`), `QueuedRequest` interface.
    - Created `processRequest()` and `finishProcessingRequest()` functions.
    - Made `newRequestBehavior` ('queue'/'drop') configurable via `server-config.json`.
    - Updated `ServerConfig` interface, `loadServerConfig()`.
    - Extended `AdminLogEntry['type']` for new log types (`CHAT_REQUEST_QUEUED`, `CHAT_REQUEST_DROPPED`, etc.).
    - Removed duplicate `/admin/update-settings` route.
- **Admin UI Enhancements (Completed `2025-05-09`)**
    - Updated `/v1/admin/server-info` to include `newRequestBehavior`.
    - Updated `/v1/admin/update-settings` to accept and save `newRequestBehavior`.
    - Modified [`api-relay-server/src/admin-ui/admin.html`](api-relay-server/src/admin-ui/admin.html):
        - Added UI elements (radio buttons) for `newRequestBehavior`.
        - Updated `fetchAndDisplayServerInfo()` to populate the new UI element.
        - Updated `handleSaveSettings()` to include `newRequestBehavior` in the payload.
- **Documentation Updates (Completed `2025-05-09`)**
    - Updated [`docs/server-architecture.md`](docs/server-architecture.md) with details of the queuing system, new configurations, and revised diagrams.
    - Updated [`docs/user-manual.md`](docs/user-manual.md) with information on configuring `newRequestBehavior` via Admin UI and `server-config.json`.
- **Memory Bank Update (In Progress `2025-05-09`)**
    - Updated [`memory-bank/activeContext.md`](memory-bank/activeContext.md) with recent changes.
    - Updated [`memory-bank/productContext.md`](memory-bank/productContext.md) with architectural changes.
- **Claude Provider Integration & Debugging (Completed `2025-05-10`)**
    - Successfully debugged issues with the Claude provider (`extension/providers/claude.js`) related to SSE stream parsing and end-of-message detection.
    - Confirmed `debuggerUrlPattern: "*/completion*"` is correct for Claude's streaming endpoint.
    - Refined `parseDebuggerResponse` to correctly identify `message_stop` and `message_delta` with `stop_reason` events.
    - Added detailed logging to `handleDebuggerData` and `parseDebuggerResponse` for improved diagnostics.
    - Commented out `ensureFunctionCallingEnabled` method and its invocations as it's not currently applicable to Claude.
- **Documentation for Claude Provider (Completed `2025-05-10`)**
    - Created `docs/provider-claude.md` detailing the Claude provider's configuration and functionality.
    - Updated `docs/consolidated-provider-documentation.md` to include a section for the Claude provider.
    - Updated `docs/provider-comparison.md` to add Claude to the comparison tables and descriptions.
    - Updated `docs/user-manual.md` to list Claude as a supported interface and model.
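
For reference, the end-of-message detection mentioned in the Claude provider entry above might look roughly like the sketch below. It is not the provider's actual `parseDebuggerResponse`: `isEndOfMessage` is a hypothetical helper, and the sketch assumes the captured body is SSE text whose `data:` lines carry JSON objects with a `type` field and, for `message_delta`, a `stop_reason` nested under `delta`.

```typescript
// Sketch only: isEndOfMessage is a hypothetical helper for illustration.
// Assumed payload shape: { type: string, delta?: { stop_reason?: string | null } }.

function isEndOfMessage(sseText: string): boolean {
  for (const line of sseText.split(/\r?\n/)) {
    // Only `data:` lines carry event payloads; `event:` lines and blanks are skipped.
    if (!line.startsWith("data:")) continue;
    const payload = line.slice("data:".length).trim();
    if (payload === "") continue;

    let parsed: unknown;
    try {
      parsed = JSON.parse(payload);
    } catch {
      continue; // ignore partial or non-JSON chunks
    }
    if (typeof parsed !== "object" || parsed === null) continue;

    const event = parsed as { type?: string; delta?: { stop_reason?: string | null } };
    if (event.type === "message_stop") return true;
    if (event.type === "message_delta" && event.delta?.stop_reason) return true;
  }
  return false;
}
```

Treating either event as terminal means the check still succeeds if only one of the two arrives in the captured chunk.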
## Current Tasks
- `[In Progress]` - Updating [`memory-bank/progress.md`](memory-bank/progress.md) (this document).
- `[Pending]` - Updating [`memory-bank/decisionLog.md`](memory-bank/decisionLog.md) with decisions made during the queuing system implementation and Claude provider debugging.
- `[Pending]` - Updating [`memory-bank/productContext.md`](memory-bank/productContext.md) to reflect Claude provider support.
## Next Steps
- Complete updates for `decisionLog.md` and `productContext.md` in the memory-bank.
- **Testing**: Perform thorough testing of the message queuing/dropping system (both 'queue' and 'drop' behaviors) as outlined in the original task's "Testing Considerations."
- **Admin UI Real-time Updates**: Consider extending WebSocket functionality to push real-time updates (new messages, log entries, status changes) to the Admin UI, rather than relying solely on polling or tab-switching refreshes for some data.
## Known Issues
- The Admin UI fetches data on tab activation, or on manual refresh for message history. Real-time push updates for logs or status changes (beyond the connected extensions count on `/server-info` refresh) are not yet implemented.
*(Note: Replace `[YYYY-MM-DD HH:MM:SS]` with actual timestamps upon completion of each item.)*

View file

@ -0,0 +1,4 @@
# System Patterns
This file documents the architectural patterns used in the project.
## Implemented Patterns
## Considered Patterns

1650
package-lock.json generated Normal file

File diff suppressed because it is too large