Adds a new operation findDistinct that can give you distinct values of a
field for a given collection
Example:
Assume you have a collection posts with multiple documents, and some of
them share the same title:
```js
// Example dataset (some titles appear multiple times)
[
{ title: 'title-1' },
{ title: 'title-2' },
{ title: 'title-1' },
{ title: 'title-3' },
{ title: 'title-2' },
{ title: 'title-4' },
{ title: 'title-5' },
{ title: 'title-6' },
{ title: 'title-7' },
{ title: 'title-8' },
{ title: 'title-9' },
]
```
You can now retrieve all unique title values using findDistinct:
```js
const result = await payload.findDistinct({
collection: 'posts',
field: 'title',
})
console.log(result.values)
// Output:
// [
// 'title-1',
// 'title-2',
// 'title-3',
// 'title-4',
// 'title-5',
// 'title-6',
// 'title-7',
// 'title-8',
// 'title-9'
// ]
```
You can also limit the number of distinct results:
```js
const limitedResult = await payload.findDistinct({
collection: 'posts',
field: 'title',
sortOrder: 'desc',
limit: 3,
})
console.log(limitedResult.values)
// Output:
// [
// 'title-1',
// 'title-2',
// 'title-3'
// ]
```
You can also pass a `where` query to filter the documents.
### What?
Adds four more arguments to the `mongooseAdapter`:
```typescript
useJoinAggregations?: boolean /* The big one */
useAlternativeDropDatabase?: boolean
useBigIntForNumberIDs?: boolean
usePipelineInSortLookup?: boolean
```
Also export a new `compatabilityOptions` object from
`@payloadcms/db-mongodb` where each key is a mongo-compatible database
and the value is the recommended `mongooseAdapter` settings for
compatability.
### Why?
When using firestore and visiting
`/admin/collections/media/payload-folders`, we get:
```
MongoServerError: invalid field(s) in lookup: [let, pipeline], only lookup(from, localField, foreignField, as) is supported
```
Firestore doesn't support the full MongoDB aggregation API used by
Payload which gets used when building aggregations for populating join
fields.
There are several other compatability issues with Firestore:
- The invalid `pipeline` property is used in the `$lookup` aggregation
in `buildSortParams`
- Firestore only supports number IDs of type `Long`, but Mongoose
converts custom ID fields of type number to `Double`
- Firestore does not support the `dropDatabase` command
- Firestore does not support the `createIndex` command (not addressed in
this PR)
### How?
```typescript
useJoinAggregations?: boolean /* The big one */
```
When this is `false` we skip the `buildJoinAggregation()` pipeline and resolve the join fields through multiple queries. This can potentially be used with AWS DocumentDB and Azure Cosmos DB to support join fields, but I have not tested with either of these databases.
```typescript
useAlternativeDropDatabase?: boolean
```
When `true`, monkey-patch (replace) the `dropDatabase` function so that
it calls `collection.deleteMany({})` on every collection instead of
sending a single `dropDatabase` command to the database
```typescript
useBigIntForNumberIDs?: boolean
```
When `true`, use `mongoose.Schema.Types.BigInt` for custom ID fields of type `number` which converts to a firestore `Long` behind the scenes
```typescript
usePipelineInSortLookup?: boolean
```
When `false`, modify the sortAggregation pipeline in `buildSortParams()` so that we don't use the `pipeline` property in the `$lookup` aggregation. Results in slightly worse performance when sorting by relationship properties.
### Limitations
This PR does not add support for transactions or creating indexes in firestore.
### Fixes
Fixed a bug (and added a test) where you weren't able to sort by multiple properties on a relationship field.
### Future work
1. Firestore supports simple `$lookup` aggregations but other databases might not. Could add a `useSortAggregations` property which can be used to disable aggregations in sorting.
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Sasha <64744993+r1tsuu@users.noreply.github.com>
Based on https://github.com/payloadcms/payload/pull/13060 which should
be merged first
This PR adds ability to update number fields atomically, which could be
important with parallel writes. For now we support this only via
`payload.db.updateOne`.
For example:
```js
// increment by 10
const res = await payload.db.updateOne({
data: {
number: {
$inc: 10,
},
},
collection: 'posts',
where: { id: { equals: post.id } },
})
// decrement by 3
const res2 = await payload.db.updateOne({
data: {
number: {
$inc: -3,
},
},
collection: 'posts',
where: { id: { equals: post.id } },
})
```
Fixes https://github.com/payloadcms/payload/issues/13045
`updateOne` when returning is `false` mutates the data object for write
operations on the DB, but that causes an issue when using that data
object later on since all of the id's are mutated to objectIDs and never
transformed back into read id's.
This fix ensures that the transform happens even when the result is not
returned.
If you (using the MongoDB adapter) delete a block from the payload
config, but still have some data with that block in the DB, you'd
receive in the admin panel an error like:
```
Block with type "cta" was found in block data, but no block with that type is defined in the config for field with schema path pages.blocks
```
Now, we remove those "unknown" blocks at the DB adapter level.
Co-authored-by: Dan Ribbens <dan.ribbens@gmail.com>
Previously, there were multiple ways to type a running job:
- `GeneratedTypes['payload-jobs']` - only works in an installed project
- is `any` in monorepo
- `BaseJob` - works everywhere, but does not incorporate generated types
which may include type for custom fields added to the jobs collection
- `RunningJob<>` - more accurate version of `BaseJob`, but same problem
This PR deprecated all those types in favor of a new `Job` type.
Benefits:
- Works in both monorepo and installed projects. If no generated types
exist, it will automatically fall back to `BaseJob`
- Comes with an optional generic that can be used to narrow down
`job.input` based on the task / workflow slug. No need to use a separate
type helper like `RunningJob<>`
With this new type, I was able to replace every usage of
`GeneratedTypes['payload-jobs']`, `BaseJob` and `RunningJob<>` with the
simple `Job` type.
Additionally, this PR simplifies some of the logic used to run jobs
Currently, we globally enable both DOM and Node.js types. While this
mostly works, it can cause conflicts - particularly with `fetch`. For
example, TypeScript may incorrectly allow browser-only properties (like
`cache`) and reject valid Node.js ones like `dispatcher`.
This PR disables DOM types for server-only packages like payload,
ensuring Node-specific typings are applied. This caught a few instances
of incorrect fetch usage that were previously masked by overlapping DOM
types.
This is not a perfect solution - packages that contain both server and
client code (like richtext-lexical or next) will still suffer from this
issue. However, it's an improvement in cases where we can cleanly
separate server and client types, like for the `payload` package which
is server-only.
## Use-case
This change enables https://github.com/payloadcms/payload/pull/12622 to
explore using node-native fetch + `dispatcher`, instead of `node-fetch`
+ `agent`.
Currently, it will incorrectly report that `dispatcher` is not a valid
property for node-native fetch
I noticed a few issues when running e2e tests that will be resolved by
this PR:
- Most important: for some test suites (fields, fields-relationship,
versions, queues, lexical), the database was cleared and seeded
**twice** in between each test run. This is because the onInit function
was running the clear and seed script, when it should only have been
running the seed script. Clearing the database / the snapshot workflow
is being done by the reInit endpoint, which then calls onInit to seed
the actual data.
- The slowest part of `clearAndSeedEverything` is recreating indexes on
mongodb. This PR slightly improves performance here by:
- Skipping this process for the built-in `['payload-migrations',
'payload-preferences', 'payload-locked-documents']` collections
- Previously we were calling both `createIndexes` and `ensureIndexes`.
This was unnecessary - `ensureIndexes` is a deprecated alias of
`createIndexes`. This PR changes it to only call `createIndexes`
- Makes the reinit endpoint accept GET requests instead of POST requests
- this makes it easier to debug right in the browser
- Some typescript fixes
- Adds a `dev:memorydb` script to the package.json. For some reason,
`dev` is super unreliable on mongodb locally when running e2e tests - it
frequently fails during index creation. Using the memorydb fixes this
issue, with the bonus of more closely resembling the CI environment
- Previously, you were unable to run test suites using turbopack +
postgres. This fixes it, by explicitly installing `pg` as devDependency
in our monorepo
- Fixes jest open handles warning
This PR introduces a few changes to improve turbopack compatibility and
ensure e2e tests pass with turbopack enabled
## Changes to improve turbopack compatibility
- Use correct sideEffects configuration to fix scss issues
- Import scss directly instead of duplicating our scss rules
- Fix some scss rules that are not supported by turbopack
- Bump Next.js and all other dependencies used to build payload
## Changes to get tests to pass
For an unknown reason, flaky tests flake a lot more often in turbopack.
This PR does the following to get them to pass:
- add more `wait`s
- fix actual flakes by ensuring previous operations are properly awaited
## Blocking turbopack bugs
- [X] https://github.com/vercel/next.js/issues/76464
- Fix PR: https://github.com/vercel/next.js/pull/76545
- Once fixed: change `"sideEffectsDisabled":` back to `"sideEffects":`
## Non-blocking turbopack bugs
- [ ] https://github.com/vercel/next.js/issues/76956
## Related PRs
https://github.com/payloadcms/payload/pull/12653https://github.com/payloadcms/payload/pull/12652
The databases do not keep track of document order internally so when
sorting by non-unique fields such as shared `order` number values, the
returned order will be random and not consistent.
While this issue is far more noticeable on mongo it could also occur in
postgres on certain environments.
This combined with pagination can lead to the perception of duplicated
or inconsistent data.
This PR adds a second sort parameter to queries so that we always have a
fallback, `-createdAt` will be used by default or `id` if timestamps are
disabled.
I think it's easier to review this PR commit by commit, so I'll explain
it this way:
## Commits
1. [parallelize eslint script (still showing logs results in
serial)](c9ac49c12d):
Previously, `--concurrency 1` was added to the script to make the logs
more readable. However, turborepo has an option specifically for these
use cases: `--log-order=grouped` runs the tasks in parallel but outputs
them serially. As a result, the lint script is now significantly faster.
2. [run pnpm
lint:fix](9c128c276a)
The auto-fix was run, which resolved some eslint errors that were
slipped in due to the use of `no-verify`. Most of these were
`perfectionist` fixes (property ordering) and the removal of unnecessary
assertions. Starting with this PR, this won't happen again in the
future, as we'll be verifying the linter in every PR across the entire
codebase (see commit 7).
3. [fix eslint non-autofixable
errors](700f412a33)
All manual errors have been resolved except for the configuration errors
addressed in commit 5. Most were React compiler violations, which have
been disabled and commented out "TODO" for now. There's also an unused
`use no memo` and a couple of `require` errors.
4. [move react-compiler linter to eslint-config
package](4f7cb4d63a)
To simplify the eslint configuration. My concern was that there would be
a performance regression when used in non-react related packages, but
none was experienced. This is probably because it only runs on .tsx
files.
5. [remove redundant eslint config files and fix
allowDefaultProject](a94347995a)
The main feature introduced by `typescript-eslint` v8 was
`projectService`, which automatically searches each file for the closest
`tsconfig`, greatly simplifying configuration in monorepos
([source](https://typescript-eslint.io/blog/announcing-typescript-eslint-v8#project-service)).
Once I moved `projectService` to `packages/eslint-config`, all the other
configuration files could be easily removed.
I confirmed that pnpm lint still works on individual packages.
The other important change was that the pending eslint errors from
commits 2 and 3 were resolved. That is, some files were giving the
error: "[File] was not found by the project service. Consider either
including it in the tsconfig.json or including it in
allowDefaultProject." Below I copy the explanatory comment I left in the
code:
```ts
// This is necessary because `tsconfig.base.json` defines `"rootDir": "${configDir}/src"`,
// And the following files aren't in src because they aren't transpiled.
// This is typescript-eslint's way of adding files that aren't included in tsconfig.
// See: https://typescript-eslint.io/troubleshooting/typed-linting/#i-get-errors-telling-me--was-not-found-by-the-project-service-consider-either-including-it-in-the-tsconfigjson-or-including-it-in-allowdefaultproject
// The best practice is to have a tsconfig.json that covers ALL files and is used for
// typechecking (with noEmit), and a `tsconfig.build.json` that is used for the build
// (or alternatively, swc, tsup or tsdown). That's what we should ideally do, in which case
// this hardcoded list wouldn't be necessary. Note that these files don't currently go
// through ts, only through eslint.
```
6. [Differentiate errors from warnings in VScode ESLint
Rules](5914d2f48d)
There's no reason to do that. If an eslint rule isn't an error, it
should be disabled or converted to a warning.
7. [Disable skip lint, and lint over the entire repo now that it's
faster](e4b28f1360)
The GitHub action linted only the files that had changed in the PR.
While this seems like a good idea, once exceptions were introduced with
[skip lint], they opened the door to propagating more and more errors.
Often, the linter was skipped, not because someone introduced new
errors, but because they were trying to avoid those that had already
crept in, sometimes accidentally introducing new ones.
On the other hand, `pnpm lint` now runs in parallel (commit 1), so it's
not that slow. Additionally, it runs in parallel with other GitHub
actions like e2e tests, which take much longer, so it can't represent a
bottleneck in CI.
8. [fix lint in next
package](4506595f91)
Small fix missing from commit 5
9. [Merge remote-tracking branch 'origin/main' into
fix-eslint](563d4909c1)
10. [add again eslint.config.js in payload
package](78f6ffcae7)
The comment in the code explains it. Basically, after the merge from
main, the payload package runs out of memory when linting, probably
because it grew in recent PRs. That package will sooner or later
collapse for our tooling, so we may have to split it. It's already too
big.
## Future Actions
- Resolve React compiler violations, as mentioned in commit 3.
- Decouple the `tsconfig` used for typechecking and build across the
entire monorepo (as explained in point 5) to ensure ts coverage even for
files that aren't transpiled (such as scripts).
- Remove the few remaining `eslint.config.js`. I had to leave the
`richtext-lexical` and `next` ones for now. They could be moved to the
root config and scoped to their packages, as we do for example with
`templates/vercel-postgres/**`. However, I couldn't get it to work, I
don't know why.
- Make eslint in the test folder usable. Not only are we not linting
`test` in CI, but now the `pnpm eslint .` command is so large that my
computer freezes. If each suite were its own package, this would be
solved, and dynamic codegen + git hooks to modify tsconfig.base.json
wouldn't be necessary
([related](https://github.com/payloadcms/payload/pull/11984)).
Fixes sorting by fields in relationships, e.g `sort: "author.name"` when
using `draft: true`. The existing test that includes check with `draft:
true` was accidentally passing because it used to sort by the
relationship field itself.
When doing `payload.db.queryDrafts` with `select` without `version`, or
simply your select looks like:
`select: { version: { nonExistingField: true } }` - the `queryDrafts`
function will crash because it tries to access the `version` field.
This PR adds a fallback.
### What?
This PR adds support for `where` querying by the join field (don't
confuse with `where` querying of related docs via `joins.where`)
Previously, this didn't work:
```
const categories = await payload.find({
collection: 'categories',
where: { 'relatedPosts.title': { equals: 'my-title' } },
})
```
### Why?
This is crucial for bi-directional relationships, can be used for access
control.
### How?
Implements `where` handling for join fields the same as we do for
relationships. In MongoDB it's not as efficient as it can be, the old PR
that improves it and can be updated later is here
https://github.com/payloadcms/payload/pull/8858
Fixes https://github.com/payloadcms/payload/discussions/9683
Previously, jobs were executed in FIFO order on MongoDB, and LIFO on
Postgres, with no way to configure this behavior.
This PR makes FIFO the default on both MongoDB and Postgres and
introduces the following new options to configure the processing order
globally or on a queue-by-queue basis:
- a `processingOrder` property to the jobs config
- a `processingOrder` argument to `payload.jobs.run()` to override
what's set in the jobs config
It also adds a new `sequential` option to `payload.jobs.run()`, which
can be useful for debugging.
This adds support for running multiple job queue tasks in parallel
within the same workflow while preventing conflicts. Previously, this
would have caused the following issues:
- Job log entries get lost - the final job log is incomplete, despite
all tasks having been executed
- Write conflicts in postgres, leading to unique constraint violation
errors
The solution involves handling job log data updates in a way that avoids
overwriting, and ensuring the final update reflects the latest job log
data. Each job log entry now initializes its own ID, so a given job log
entry’s ID remains the same across multiple, parallel task executions.
## Postgres
In Postgres, we need to enable transactions for the
`payload.db.updateJobs` operation; otherwise, two tasks updating the
same job in parallel can conflict. This happens because Postgres handles
array rows by deleting them all, then re-inserting (rather than
upserting). The rows are stored in a separate table, and the following
scenario can occur:
Op 1: deletes all job log rows
Op 2: deletes all job log rows
Op 1: inserts 200 job log rows
Op 2: insert the same 200 job log rows again => `error: “duplicate key
value violates unique constraint "payload_jobs_log_pkey”`
Because transactions were not used, the rows inserted by Op 1
immediately became visible to Op 2, causing the conflict. Enabling
transactions fixes this. In theory, it can still happen if Op 1 commits
before Op 2 starts inserting (due to the read committed isolation
level), but it should occur far less frequently.
Alongside this change, we should consider inserting the rows using an
upsert (update on conflict), which will get rid of this error
completely. That way, if the insertion of Op 1 is visible to Op 2, Op 2
will simply overwrite it, rather than erroring. Individual job entries
are immutable and job entries cannot be deleted, thus this shouldn't
corrupt any data.
## Mongo
In Mongo, the issue is addressed by ensuring that log row deletions
caused due to different log states in concurrent operations are not
merged back to the client job log, and by making sure the final update
includes all job logs.
There is no duplicate key error in Mongo because the array log resides
in the same document and duplicates are simply upserted. We cannot use
transactions in Mongo, as it appears to lock the document in a way that
prevents reliable parallel updates, leading to:
`MongoServerError: WriteConflict error: this operation conflicted with
another operation. Please retry your operation or multi-document
transaction`
You can access the database name from `sanitizedConfig.db.name`. But
currently, it' not possible to access the db name from the unsanitized
config.
Plugins only have access to the unsanitized config. This change allows
db adapters to return the db name early, which will allow plugins to
conditionally initialize db-specific functionality
Continuation of #11489. This adds a new, optional `updateJobs` db
adapter method that reduces the amount of database calls for the jobs
queue.
## MongoDB
### Previous: running a set of 50 queued jobs
- 1x db.find (= 1x `Model.paginate`)
- 50x db.updateOne (= 50x `Model.findOneAndUpdate`)
### Now: running a set of 50 queued jobs
- 1x db.updateJobs (= 1x `Model.find` and 1x `Model.updateMany`)
**=> 51 db round trips before, 2 db round trips after**
### Previous: upon task completion
- 1x db.find (= 1x `Model.paginate`)
- 1x db.updateOne (= 1x `Model.findOneAndUpdate`)
### Now: upon task completion
- 1x db.updateJobs (= 1x `Model.findOneAndUpdate`)
**=> 2 db round trips before, 1 db round trip after**
## Drizzle (e.g. Postgres)
### running a set of 50 queued jobs
- 1x db.query[tablename].findMany
- 50x db.select
- 50x upsertRow
This is unaffected by this PR and will be addressed in a future PR