v0.17.0 dbt_zendesk

fivetran-data-model-bot released this 04 Sep 20:57
0b73c19

New model (#161)

  • Addition of the zendesk__document model, designed to structure Zendesk textual data for vectorization and integration into NLP workflows. The model outputs a table with:
    • document_id: Corresponding to the ticket_id
    • chunk_index: For text segmentation
    • chunk: The text chunk itself
    • chunk_tokens_approximate: Approximate token count for each segment
  • This model is currently disabled by default. You may enable it by setting the zendesk__unstructured_enabled variable to true in your dbt_project.yml.
    • This model was developed with chunk sizes limited to approximately 5000 tokens for use with OpenAI. You can change this limit by setting the zendesk_max_tokens variable in your dbt_project.yml.
    • See the README section Enabling the unstructured document model for NLP for more information.
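
As a minimal sketch of the two settings described above, the variables could be placed in dbt_project.yml as follows. The variable names come from these notes; putting them under a top-level vars: key follows the usual dbt convention and is an assumption here, so defer to the README if it differs.

```yaml
# dbt_project.yml (sketch; top-level vars: placement is assumed)
vars:
  # Turns on the zendesk__document model, which is disabled by default.
  zendesk__unstructured_enabled: true
  # Optional: approximate maximum tokens per chunk.
  # 5000 is the limit the model was developed with; lower it for smaller chunks.
  zendesk_max_tokens: 5000
```

Leaving zendesk_max_tokens unset should keep the limit the model was developed with, so only the first variable is needed to enable the model.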

Breaking Changes (Full refresh required after upgrading)

  • Incremental models running on BigQuery now use month granularity in their partition_by logic. This change only impacts BigQuery warehouses and was applied to avoid the common "too many partitions" error some users have experienced when partitioning by day. Partitioning by month decreases the number of partitions created and allows for more performant querying and incremental loads. This change was applied to the following models (#165):

    • int_zendesk__field_calendar_spine
    • int_zendesk__field_history_pivot
    • zendesk__ticket_field_history
  • In the dbt_zendesk_source v0.12.0 release, the field _fivetran_deleted was added to the following models for use in the zendesk__document model (#161):

    • stg_zendesk__ticket
    • stg_zendesk__ticket_comment
    • stg_zendesk__user
    • If you have already added _fivetran_deleted as a passthrough column via the zendesk__ticket_passthrough_columns or zendesk__user_passthrough_columns variable, you will need to remove this field from the variable, or alias it, to avoid duplicate column errors.
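
The passthrough adjustment above might look like the following sketch. It assumes the passthrough variables take a simple list of column names, and custom_priority is a hypothetical column used only for illustration; check the README for the format your version expects.

```yaml
# dbt_project.yml (sketch; list-style passthrough format is an assumption)
vars:
  # Before upgrading (would now produce a duplicate column error):
  # zendesk__ticket_passthrough_columns: [custom_priority, _fivetran_deleted]

  # After upgrading: _fivetran_deleted is removed because the staging model now supplies it.
  zendesk__ticket_passthrough_columns: [custom_priority]  # custom_priority is hypothetical
```

The same change would apply to zendesk__user_passthrough_columns if _fivetran_deleted was listed there.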

Bug Fixes

  • Fixed an issue in the zendesk__sla_policies model where tickets that were opened and solved outside of scheduled hours were not being reported, specifically for the metrics requester_wait_time and agent_work_time.
    • Resolved by adjusting the join logic in models int_zendesk__agent_work_time_business_hours and int_zendesk__requester_wait_time_business_hours. (#164, #156)
  • Fixed an issue in the zendesk__ticket_metrics model where certain tickets had miscalculated metrics.
    • Resolved by adjusting the join logic in models int_zendesk__ticket_work_time_business, int_zendesk__ticket_first_resolution_time_business, and int_zendesk__ticket_full_resolution_time_business. (#167)

Under the hood

  • Added integrity validations:
    • Test to ensure zendesk__sla_policies and zendesk__ticket_metrics models produce consistent time results. (#164)
    • Test to ensure zendesk__ticket_metrics contains all the tickets found in stg_zendesk__ticket. (#167)
  • Modified the consistency_sla_policy_count validation test to group by ticket_id for more accurate testing. (#165)
  • Updated casting in joins from timestamps to dates so that the whole day is considered. This produces more accurate results. (#164, #156, #167)
  • Reduced the look-ahead window from 208 weeks to 52 to improve performance, as tracking ticket SLAs beyond one year was unnecessary. (#156, #167)
  • Updated seed files to reflect a real-world ticket field history update scenario. (#165)

Full Changelog: v0.16.0...v0.17.0