payloadcms

Author	SHA1	Message	Date
Alessio Gravili	c08b2aea89	feat: scheduling jobs (#12863 ) Adds a new `schedule` property to workflow and task configs that can be used to have Payload automatically _queue_ jobs following a certain _schedule_. Docs: https://payloadcms.com/docs/dynamic/jobs-queue/schedules?branch=feat/schedule-jobs ## API Example ```ts export default buildConfig({ // ... jobs: { // ... scheduler: 'manual', // Or `cron` if you're not using serverless. If `manual` is used, then user needs to set up running /api/payload-jobs/handleSchedules or payload.jobs.handleSchedules in regular intervals tasks: [ { schedule: [ { cron: '* * * * * *', queue: 'autorunSecond', // Hooks are optional hooks: { // Not an array, as providing and calling `defaultBeforeSchedule` would be more error-prone if this was an array beforeSchedule: async (args) => { // Handles verifying that there are no jobs already scheduled or processing. // You can override this behavior by not calling defaultBeforeSchedule, e.g. if you wanted // to allow a maximum of 3 scheduled jobs in the queue instead of 1, or add any additional conditions const result = await args.defaultBeforeSchedule(args) return { ...result, input: { message: 'This task runs every second', }, } }, afterSchedule: async (args) => { await args.defaultAfterSchedule(args) // Handles updating the payload-jobs-stats global args.req.payload.logger.info( 'EverySecond task scheduled: ' + (args.status === 'success' ? args.job.id : 'skipped or failed to schedule'), ) }, }, }, ], slug: 'EverySecond', inputSchema: [ { name: 'message', type: 'text', required: true, }, ], handler: ({ input, req }) => { req.payload.logger.info(input.message) return { output: {}, } }, } ] } }) ``` --- - To see the specific tasks where the Asana app for GitHub is being used, see below: - https://app.asana.com/0/0/1210495300843759	2025-07-18 06:48:27 -04:00
Alessio Gravili	59f536c2c9	refactor: simplify job queue error handling (#12845 ) This simplifies workflow / task error handling, as well as cancelling jobs. Previously, we were handling errors when they occur and passing through error state using a `state` object - errors were then handled in multiple areas of the code. This PR adds new, clean `TaskError`, `WorkflowError` and `JobCancelledError` errors that are thrown when they occur and are handled in one single place, massively cleaning up complex functions like [payload/src/queues/operations/runJobs/runJob/getRunTaskFunction.ts](https://github.com/payloadcms/payload/compare/refactor/jobs-errors?expand=1#diff-53dc7ccb7c8e023c9ba63fdd2e78c32ad0be606a2c64a3512abad87893f5fd21) Performance will also be positively improved by this change - previously, a task / workflow failure or cancellation would have resulted in multiple, separate `updateJob` db calls, as data modifications to the job object required for storing failure state were done multiple times in multiple areas of the codebase. Most notably, task error state was handled and updated separately from workflow error state. Now, it's just a clean, single `updateJob` call. This PR also does the following: - adds a new test for `deleteJobOnComplete` behavior - cleans up test suite - ensures `deleteJobOnComplete` does not delete definitively failed jobs --- - To see the specific tasks where the Asana app for GitHub is being used, see below: - https://app.asana.com/0/0/1210553277813320	2025-06-17 22:24:53 +00:00
Alessio Gravili	84cb2b5819	refactor: simplify job type (#12816 ) Previously, there were multiple ways to type a running job: - `GeneratedTypes['payload-jobs']` - only works in an installed project - is `any` in monorepo - `BaseJob` - works everywhere, but does not incorporate generated types which may include types for custom fields added to the jobs collection - `RunningJob<>` - more accurate version of `BaseJob`, but same problem This PR deprecates all those types in favor of a new `Job` type. Benefits: - Works in both monorepo and installed projects. If no generated types exist, it will automatically fall back to `BaseJob` - Comes with an optional generic that can be used to narrow down `job.input` based on the task / workflow slug. No need to use a separate type helper like `RunningJob<>` With this new type, I was able to replace every usage of `GeneratedTypes['payload-jobs']`, `BaseJob` and `RunningJob<>` with the simple `Job` type. Additionally, this PR simplifies some of the logic used to run jobs.	2025-06-16 16:15:56 -04:00
Alessio Gravili	7c05c775cb	docs: improve jobs autorun docs, adds e2e test (#12196 ) This clarifies that jobs.autoRun only runs already-queued jobs. It does not queue the jobs for you. Also adds an e2e test as this functionality had no e2e coverage	2025-06-05 09:19:19 -07:00
Alessio Gravili	545d870650	chore: fix various e2e test setup issues (#12670 ) I noticed a few issues when running e2e tests that will be resolved by this PR: - Most important: for some test suites (fields, fields-relationship, versions, queues, lexical), the database was cleared and seeded twice in between each test run. This is because the onInit function was running the clear and seed script, when it should only have been running the seed script. Clearing the database / the snapshot workflow is being done by the reInit endpoint, which then calls onInit to seed the actual data. - The slowest part of `clearAndSeedEverything` is recreating indexes on mongodb. This PR slightly improves performance here by: - Skipping this process for the built-in `['payload-migrations', 'payload-preferences', 'payload-locked-documents']` collections - Previously we were calling both `createIndexes` and `ensureIndexes`. This was unnecessary - `ensureIndexes` is a deprecated alias of `createIndexes`. This PR changes it to only call `createIndexes` - Makes the reinit endpoint accept GET requests instead of POST requests - this makes it easier to debug right in the browser - Some typescript fixes - Adds a `dev:memorydb` script to the package.json. For some reason, `dev` is super unreliable on mongodb locally when running e2e tests - it frequently fails during index creation. Using the memorydb fixes this issue, with the bonus of more closely resembling the CI environment - Previously, you were unable to run test suites using turbopack + postgres. This fixes it, by explicitly installing `pg` as devDependency in our monorepo - Fixes jest open handles warning	2025-06-04 17:34:37 -03:00
Jacob Fletcher	e87521a376	perf(ui): significantly optimize form state component rendering, up to 96% smaller and 75% faster (#11946 ) Significantly optimizes the component rendering strategy within the form state endpoint by precisely rendering only the fields that require it. This cuts down on server processing and network response sizes when invoking form state requests that manipulate array and block rows which contain server components, such as rich text fields, custom row labels, etc. (results listed below). Here's a breakdown of the issue: Previously, when manipulating array and block fields, _all_ rows would render any server components that might exist within them, including rich text fields. This means that subsequent changes to these fields would potentially _re-render_ those same components even if they don't require it. For example, if you have an array field with a rich text field within it, adding the first row would cause the rich text field to render, which is expected. However, when you add a second row, the rich text field within the first row would render again unnecessarily along with the new row. This is especially noticeable for fields with many rows, where every single row processes its server components and returns RSC data. And this does not only affect nested rich text fields, but any custom component defined on the field level, as these are handled in the same way. The reason this was necessary in the first place was to ensure that the server components receive the proper data when they are rendered, such as the row index and the row's data. Changing one of these rows could cause the server component to receive the wrong data if it was not freshly rendered. While this is still a requirement that rows receive up-to-date props, it is no longer necessary to render everything. Here's a breakdown of the actual fix: This change ensures that only the fields that are actually being manipulated will be rendered, rather than all rows. 
The existing rows will remain in memory on the client, while the newly rendered components will return from the server. For example, if you add a new row to an array field, only the new row will render its server components. To do this, we send the path of the field that is being manipulated to the server. The server can then use this path to determine for itself which fields have already been rendered and which ones still need to be rendered. ## Results The following results were gathered by booting up the `form-state` test suite and seeding 100 array rows, each containing a rich text field. To invoke a form state request, we navigate to a document within the "posts" collection, then add a new array row to the list. The result is then saved to the file system for comparison. \| Test Suite \| Collection \| Number of Rows \| Before \| After \| Percentage Change \| \|------\|------\|---------\|--------\|--------\|--------\| \| `form-state` \| `posts` \| 101 \| 1.9MB / 266ms \| 80KB / 70ms \| ~96% smaller / ~75% faster \| --------- Co-authored-by: James <james@trbl.design> Co-authored-by: Alessio Gravili <alessio@gravili.de>	2025-04-03 12:27:14 -04:00
Alessio Gravili	c844b4c848	feat: configurable job queue processing order (LIFO/FIFO), allow sequential execution of jobs (#11897 ) Previously, jobs were executed in FIFO order on MongoDB, and LIFO on Postgres, with no way to configure this behavior. This PR makes FIFO the default on both MongoDB and Postgres and introduces the following new options to configure the processing order globally or on a queue-by-queue basis: - a `processingOrder` property to the jobs config - a `processingOrder` argument to `payload.jobs.run()` to override what's set in the jobs config It also adds a new `sequential` option to `payload.jobs.run()`, which can be useful for debugging.	2025-03-31 15:00:36 -06:00
Alessio Gravili	9a1c3cf4cc	fix: support parallel job queue tasks (#11917 ) This adds support for running multiple job queue tasks in parallel within the same workflow while preventing conflicts. Previously, this would have caused the following issues: - Job log entries get lost - the final job log is incomplete, despite all tasks having been executed - Write conflicts in postgres, leading to unique constraint violation errors The solution involves handling job log data updates in a way that avoids overwriting, and ensuring the final update reflects the latest job log data. Each job log entry now initializes its own ID, so a given job log entry’s ID remains the same across multiple, parallel task executions. ## Postgres In Postgres, we need to enable transactions for the `payload.db.updateJobs` operation; otherwise, two tasks updating the same job in parallel can conflict. This happens because Postgres handles array rows by deleting them all, then re-inserting (rather than upserting). The rows are stored in a separate table, and the following scenario can occur: Op 1: deletes all job log rows Op 2: deletes all job log rows Op 1: inserts 200 job log rows Op 2: inserts the same 200 job log rows again => `error: duplicate key value violates unique constraint "payload_jobs_log_pkey"` Because transactions were not used, the rows inserted by Op 1 immediately became visible to Op 2, causing the conflict. Enabling transactions fixes this. In theory, it can still happen if Op 1 commits before Op 2 starts inserting (due to the read committed isolation level), but it should occur far less frequently. Alongside this change, we should consider inserting the rows using an upsert (update on conflict), which will get rid of this error completely. That way, if the insertion of Op 1 is visible to Op 2, Op 2 will simply overwrite it, rather than erroring. Individual job entries are immutable and job entries cannot be deleted, thus this shouldn't corrupt any data. 
## Mongo In Mongo, the issue is addressed by ensuring that log row deletions caused by different log states in concurrent operations are not merged back to the client job log, and by making sure the final update includes all job logs. There is no duplicate key error in Mongo because the array log resides in the same document and duplicates are simply upserted. We cannot use transactions in Mongo, as it appears to lock the document in a way that prevents reliable parallel updates, leading to: `MongoServerError: WriteConflict error: this operation conflicted with another operation. Please retry your operation or multi-document transaction`	2025-03-31 13:06:05 -06:00
Alessio Gravili	38131ed2c3	feat: ability to cancel jobs (#11409 ) This adds new `payload.jobs.cancel` and `payload.jobs.cancelByID` methods that allow you to cancel already-running jobs, or prevent queued jobs from running. While it's not possible to cancel a function mid-execution, this will stop job execution the next time the job makes a request to the db, which happens after every task.	2025-02-28 17:58:43 +00:00
Alessio Gravili	d53f166476	fix: ensure errors returned from tasks are properly logged (#11443 ) Fixes https://github.com/payloadcms/payload/issues/9767 We allow failing a job queue task by returning `{ state: 'failed' }` from the task, instead of throwing an error. However, previously, this threw an error when trying to update the task in the database. Additionally, it was not possible to customize the error message. This PR fixes that by letting you return `errorMessage` alongside `{ state: 'failed' }`, and by ensuring the error is transformed into proper json before saving it to the `error` column.	2025-02-28 16:00:56 +00:00
Alessio Gravili	c6ab312286	chore: cleanup queues test suite (#11410 ) This PR extracts each workflow of our queues test suite into its own file	2025-02-26 19:43:31 +00:00
Alessio Gravili	08fb159943	feat: allow running sub-tasks from tasks (#10373 ) Task handlers now receive `inlineTask` as an arg, which can be used to run inline sub-tasks. In the task log, those inline tasks will have a `parent` property that points to the parent task. Example: ```ts { slug: 'subTask', inputSchema: [ { name: 'message', type: 'text', required: true, }, ], handler: async ({ job, inlineTask }) => { await inlineTask('create two docs', { task: async ({ input, inlineTask }) => { const { newSimple } = await inlineTask('create doc 1', { task: async ({ req }) => { const newSimple = await req.payload.create({ collection: 'simple', req, data: { title: input.message, }, }) return { output: { newSimple, }, } }, }) const { newSimple2 } = await inlineTask('create doc 2', { task: async ({ req }) => { const newSimple2 = await req.payload.create({ collection: 'simple', req, data: { title: input.message, }, }) return { output: { newSimple2, }, } }, }) return { output: { simpleID1: newSimple.id, simpleID2: newSimple2.id, }, } }, input: { message: job.input.message, }, }) }, } as WorkflowConfig<'subTask'> ``` Job log example: ```ts [ { executedAt: '2025-01-06T03:55:44.682Z', completedAt: '2025-01-06T03:55:44.684Z', taskSlug: 'inline', taskID: 'create doc 1', output: { newSimple: [Object] }, parent: { taskSlug: 'inline', taskID: 'create two docs' }, // <= New state: 'succeeded', id: '677b5440ba35d345d1214d1b' }, { executedAt: '2025-01-06T03:55:44.690Z', completedAt: '2025-01-06T03:55:44.692Z', taskSlug: 'inline', taskID: 'create doc 2', output: { newSimple2: [Object] }, parent: { taskSlug: 'inline', taskID: 'create two docs' }, // <= New state: 'succeeded', id: '677b5440ba35d345d1214d1c' }, { executedAt: '2025-01-06T03:55:44.681Z', completedAt: '2025-01-06T03:55:44.697Z', taskSlug: 'inline', taskID: 'create two docs', input: { message: 'hello!' 
}, output: { simpleID1: '677b54401e34772cc63c8693', simpleID2: '677b54401e34772cc63c8697' }, parent: {}, state: 'succeeded', id: '677b5440ba35d345d1214d1d' } ] ```	2025-01-07 17:24:00 +00:00
Alessio Gravili	a89d54454a	fix: ensure jobs do not retry indefinitely by default, fix undefined values in error messages (#9605 ) ## Fix default retries By default, if no `retries` property has been set, jobs / tasks should not be retried. This was not the case previously, as the `maxRetries` variable was `undefined`, causing jobs to retry endlessly. This PR sets them to `0` by default. Additionally, this fixes some undesirable behavior of the workflow retries property. Workflow retries now act as maximum, workflow-level retries. Only tasks that do not have a retry property set will inherit the workflow-level retries. ## Fix error messages Previously, you were able to encounter error messages with undefined values like these: ![CleanShot 2024-11-28 at 15 23 37@2x](https://github.com/user-attachments/assets/81617ca8-11de-4d35-b9bf-cc6c5bc515be) The reason is that it was always using `job.workflowSlug` for the error messages. However, if you queue a task directly, without a workflow, `job.workflowSlug` is undefined and `job.taskSlug` should be used instead. This PR then gets rid of the second undefined value by ensuring that `maxRetries` is never undefined	2024-12-02 22:05:48 +00:00
James Mikrut	8970c6b3a6	feat: adds jobs queue (#8228 ) Adds a jobs queue to Payload. - [x] Docs, w/ examples for Vercel Cron, additional services - [x] Type the `job` using GeneratedTypes in `JobRunnerArgs` (@AlessioGr) - [x] Write the `runJobs` function - [x] Allow for some type of `payload.runTask` - [x] Open up a new bin script for running jobs - [x] Determine strategy for runner endpoint to either await jobs successfully or return early and stay open until job work completes (serverless ramifications here) - [x] Allow for job runner to accept how many jobs to run in one invocation - [x] Make a Payload local API method for creating a new job easily (payload.createJob) or similar which is strongly typed (@AlessioGr) - [x] Make `payload.runJobs` or similar (@AlessioGr) - [x] Write tests for retrying up to max retries for a given step - [x] Write tests for dynamic import of a runner The shape of the config should permit the definition of steps separate from the job workflows themselves. ```js const config = { // Not sure if we need this property anymore queues: { }, // A job is an instance of a workflow, stored in DB // and triggered by something at some point jobs: { // Be able to override the jobs collection collectionOverrides: () => {}, // Workflows are groups of tasks that handle // the flow from task to task. // When defined on the config, they are considered as predefined workflows // BUT - in the future, we'll allow for UI-based workflow definition as well. 
workflows: [ { slug: 'job-name', // Temporary name for this // should be able to pass function // or path to it for Node to dynamically import controlFlowInJS: '/my-runner.js', // Temporary name as well // should be able to eventually define workflows // in UI (meaning they need to be serialized in JSON) // Should not be able to define both control flows controlFlowInJSON: [ { task: 'myTask', next: { // etc } } ], // Workflows take input // which are a group of fields input: [ { name: 'post', type: 'relationship', relationTo: 'posts', maxDepth: 0, required: true, }, { name: 'message', type: 'text', required: true, }, ], }, ], // Tasks are defined separately as isolated functions // that can be retried on fail tasks: [ { slug: 'myTask', retries: 2, // Each task takes input // Used to auto-type the task func args input: [ { name: 'post', type: 'relationship', relationTo: 'posts', maxDepth: 0, required: true, }, { name: 'message', type: 'text', required: true, }, ], // Each task takes output // Used to auto-type the function signature output: [ { name: 'success', type: 'checkbox', } ], onSuccess: () => {}, onFail: () => {}, run: myRunner, }, ] } } ``` ### `payload.createJob` This function should allow for the creation of jobs based on either a workflow (group of tasks) or an individual task. To create a job using a workflow: ```js const job = await payload.createJob({ // Accept the `name` of a workflow so we can match to either a // code-based workflow OR a workflow defined in the DB // Should auto-type the input workflowName: 'myWorkflow', input: { // typed to the args of the workflow by name } }) ``` To create a job using a task: ```js const job = await payload.createJob({ // Accept the `name` of a task task: 'myTask', input: { // typed to the args of the task by name } }) ``` --------- Co-authored-by: Alessio Gravili <alessio@gravili.de> Co-authored-by: Dan Ribbens <dan.ribbens@gmail.com>	2024-10-30 17:56:50 +00:00

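The FIFO/LIFO behavior described in c844b4c848 can be pictured with a small, self-contained sketch. This is not Payload's implementation: `QueuedJob`, `ProcessingOrder` and `orderJobs` are hypothetical names introduced here for illustration, and the sketch assumes that jobs are ordered by a `createdAt` timestamp.

```typescript
// Hypothetical types for illustration only; they are not part of Payload's API.
type ProcessingOrder = 'fifo' | 'lifo'

interface QueuedJob {
  id: string
  createdAt: string // ISO 8601 timestamp (assumed ordering key)
}

// Returns a new array of jobs in the order they would be picked up.
// FIFO (oldest first) is the default, matching the default described in the commit.
function orderJobs(jobs: QueuedJob[], order: ProcessingOrder = 'fifo'): QueuedJob[] {
  const oldestFirst = [...jobs].sort(
    (a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt),
  )
  return order === 'fifo' ? oldestFirst : oldestFirst.reverse()
}

const sampleJobs: QueuedJob[] = [
  { id: 'b', createdAt: '2025-03-31T10:00:01.000Z' },
  { id: 'a', createdAt: '2025-03-31T10:00:00.000Z' },
  { id: 'c', createdAt: '2025-03-31T10:00:02.000Z' },
]

const fifoIDs = orderJobs(sampleJobs).map((job) => job.id)
const lifoIDs = orderJobs(sampleJobs, 'lifo').map((job) => job.id)
```

In a real project the order would be chosen through the `processingOrder` option named in the commit message rather than through a helper like this one.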
14 Commits
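
The retry defaults described in a89d54454a (no retries unless configured, and tasks without their own `retries` inheriting the workflow-level value) can be sketched as a single fallback chain. This is a minimal illustration, not Payload's code: `resolveMaxRetries` is a hypothetical helper, retries are modeled as plain numbers only, and any capping of task retries by the workflow value is deliberately not modeled because the commit message does not spell it out.

```typescript
// Hypothetical helper for illustration; not part of Payload's API.
// Resolution order: task-level retries, then workflow-level retries, then 0.
function resolveMaxRetries(taskRetries?: number, workflowRetries?: number): number {
  // `??` only falls through on null/undefined, so an explicit 0 on the task is respected.
  return taskRetries ?? workflowRetries ?? 0
}

const noConfig = resolveMaxRetries() // nothing configured
const inherited = resolveMaxRetries(undefined, 3) // task inherits workflow value
const taskWins = resolveMaxRetries(2, 5) // task-level value is used
const explicitZero = resolveMaxRetries(0, 5) // explicit 0 is not overridden
```

Using `??` rather than `||` matters here: with `||`, a task that explicitly sets `retries: 0` would silently inherit the workflow value.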