* [feature] Block Google Bard/AI crawlers
* [feature] Block the other OpenAI crawler
* [feature] Block Common Crawl crawler
This is used in research, but also gleefully advertises itself as the
training source used in all LLMs and GPT-3.
Fixes: #2240
* [feature] Block Omgilikebot
Used by some shady big web data engine company.
* [feature] Block Meta's language model crawler
* [feature] Block well-known.dev crawler
* use minID properly for public timeline
* return paged response properly even when 0 items
* use gtserror
* page more consistently (for now)
* test
* aaa
* update typeconverter to use state structure
* deinterface the typeutils.TypeConverter -> typeutils.Converter
* finish copying over old type converter code comments
* fix cherry-pick merge issues, fix tests pointing to old typeutils interface type still
* love like winter! wohoah, wohoah
* domain allow side effects
* tests! logging! unallow!
* document federation modes
* linty linterson
* test
* further adventures in documentation
* finish up domain block documentation (i think)
* change wording a wee little bit
* docs, example
* consolidate shared domainPermission code
* call mode once
* fetch federation mode within domain blocked func
* read domain perm import in streaming manner
* don't use pointer to slice for domain perms
* don't bother copying blocks + allows before deleting
* admonish!
* change wording just a scooch
* update docs
* [feature] Support Actor URIs for webfinger queries
It's now possible to pass an Actor URI as the resource to query for when
doing a webfinger query. The code now extracts the username and domain
from the URI. The URI needs to be fully qualified, including having a
scheme of http or https to be recognised as such.
The acct scheme is handled as we used to, including dealing with an
erroneous leading @ on the username. We retain the ability to handle
resources without a scheme by parsing them again with the acct scheme if
the original parse failed. This can happen due to parsing ambiguities
when dealing with a string like user@domain.tld:port.
* [bugfix] Remove debugging changes
* [chore] Make TestExtractNamestring table-driven
* [chore] Unnest Trim and Split for readability
* [feature] Add http trace exporter, drop Jaeger
Jaeger supports ingesting traces using the OpenTelemetry gRPC or HTTP
methods. The Jaeger project has deprecated the old jaeger transport.
* Add support for submitting traces over HTTP
* Drop support for the old Jaeger protocol
* Upgrade the trace libraries to v1.17
Fixes: #2176Fixes: #2179
c.FullPath() is the empty string if a request doesn't match any route on
our mux. In those cases, there's no value in emitting a trace. The trace
will be empty, containing no other information beyond the fact that we
didn't match a route. Since Gin breaks off the processing early we don't
need to trace this request as it won't do anything and consumes no
further resources.
The 404 will still be emitted by our logs and will be visible from a
reverse proxy too.
* move SQLite pragmas into connection string
Signed-off-by: kim <grufwub@gmail.com>
* use url.Values type for SQLite connection preferences
Signed-off-by: kim <grufwub@gmail.com>
* set SQLite URI prefs properly using _pragma query key
Signed-off-by: kim <grufwub@gmail.com>
* add notes on SQLite connection preferences
Signed-off-by: kim <grufwub@gmail.com>
* fix typo
Signed-off-by: kim <grufwub@gmail.com>
* add one extra line regarding connection pooling
Signed-off-by: kim <grufwub@gmail.com>
---------
Signed-off-by: kim <grufwub@gmail.com>
* wrap bun.Tx to add our own error processing
Signed-off-by: kim <grufwub@gmail.com>
* add compile-time check for updateRowError() compatibility with sql.Row, fix wrapTx() not being used properly
Signed-off-by: kim <grufwub@gmail.com>
---------
Signed-off-by: kim <grufwub@gmail.com>
* [feature] list commands for both attachment and emojis
* use fewer commands, provide `local-only` and `remote-only` as filters
* envparsing
---------
Co-authored-by: Romain de Laage <romain.delaage@rdelaage.ovh>
Co-authored-by: tsmethurst <tobi.smethurst@protonmail.com>
* [feature] Don't emit timestamp in log lines
When running gotosocial with a service manager like systemd, or a
container runtime, the associated log driver usually emits timestamps
itself. In those cases, having the extra timestamp from our own log
lines ends up being a bit noisy and when centrally ingesting logs is
duplicate information.
This introduces a configuration flag that allows disabling emitting the
timestamp. It's only wired up for "daemonised" processes, meaning server
and testrig.
* [chore] Add docs for log-timestamp
* [feature] Simplify timestamp handling
Co-Authored-By: kim <89579420+NyaaaWhatsUpDoc@users.noreply.github.com>
* [chore] Less escaped double-quotes
* [chore] Fix help string
---------
Co-authored-by: kim <89579420+NyaaaWhatsUpDoc@users.noreply.github.com>
* init instance rules database model, admin api
* expose instance rules in public instance api
* public /api/v1/instance/rules route
* GET ruleById
* createRule route
* createRule auth check
* updateRule
* deleteRule
* list rules on about page
* ruleGet auth
* add about page ids for anchors
* process and store adding violated rules to reports
* admin api models for instance rules
* instance rule edit frontend
* change rule inputs to textareas
* database fixes after rebase (#2124)
* remove unused imports
* fix db migration column name
* fix tests
* fix more tests
* fix postgres error with wrongly used Ident
* add some tests, fiddle with rule model a bit, fix postgres migration
* swagger docs
---------
Co-authored-by: tsmethurst <tobi.smethurst@protonmail.com>
* use calculated exampleTime instead of `time.Now()` to ensure no locale data, retweak cache ratios
* update envparsing test
* update default cache memory to 100MiB
* fix envparsing with latest cache target default
---------
Signed-off-by: kim <grufwub@gmail.com>
This adds the CSP header with a policy of only loading from the same
domain. We don't make use of external media, CSS, JS, fonts, so we don't
ever need external data loaded in our context.
When building a DEBUG build, the policy gets extended to include
localhost:*, i.e localhost on any port. This keeps the live-reloading
flow for JS development working. localhost and 127.0.0.1 are considered
to be the same so mixing and matching those doesn't result in a CSP
violation.
* Add/update some DB functions.
* move async workers into subprocessor
* rename FromFederator -> FromFediAPI
* update home timeline check to include check for current status first before moving to parent status
* change streamMap to pointer to mollify linter
* update followtoas func signature
* fix merge
* remove errant debug log
* don't use separate errs.Combine() check to wrap errs
* wrap parts of workers functionality in sub-structs
* populate report using new db funcs
* embed federator (tiny bit tidier)
* flesh out error msg, add continue(!)
* fix other error messages to be more specific
* better, nicer
* give parseURI util function a bit more util
* missing headers
* use pointers for subprocessors
* Allow full BCP 47 in language inputs
Fixes#2066
* Fuse validation and normalization for languages
* Remove outdated comment line
* Move post language canonicalization test
* [chore] Remove go-playground/validator
It turns out we're not actually using the validator code. This is a
remnant from when we intended to use it, but the presence of it and its
struct tags creates the illusion we're validating a lot of things we're
not. It resulted in some confusion when we were trying to figure out
language valdiation.
Remove all this code, so that only the validation functions from the
validate package we actually use remain. I'm not touching the struct
tags in the migrations in order to avoid things potentially thinking
migrations need to be re-run.
* [chore] Bring back a struct tag on api
The validate on internal/api is Gin doing form validation, not the
validator from go-playground/validator.