
Machine Learning Analysis - Part III
AI-Assisted Document Analysis: Record-Bound Reliability and Principled Admissibility in Canadian Law
January 1st, 2026
Abstract
The integration of artificial intelligence into legal practice has evolved from theoretical possibility to practical necessity. This paper examines the evidentiary reliability of AI-assisted document analysis, particularly large language models (LLMs) such as OpenAI's ChatGPT, Anthropic's Claude, Perplexity AI, and Google's Gemini, when employed for content verification, summarization, and pattern recognition within defined documentary records. Drawing upon Canadian jurisprudence, comparative American precedent, peer-reviewed empirical studies, and established evidentiary principles, this analysis demonstrates that transparent, record-bound AI analysis satisfies traditional reliability thresholds under both the hearsay doctrine and the functional approach to electronic evidence. The paper distinguishes legitimate applications of AI as analytical tools from problematic "black box" scenarios and fabrication cases, arguing that Canadian courts should adopt a principled, evidence-based approach rather than categorical exclusion when evaluating AI-assisted analysis.
Keywords: artificial intelligence in law, AI-assisted document analysis, large language models, evidentiary admissibility, threshold reliability, electronic evidence, Canada Evidence Act, system integrity, record-bound audits, closed-domain tasks, verifiability, replicability, transparency in AI use, prompt disclosure, model disclosure, technology-assisted review, e-discovery, document summarization, pattern recognition, timeline construction, AI hallucinations, hearsay rulings, fabricated case law, professional responsibility, duty of competence, duty of candour, verification obligations, confidentiality risks, procedural fairness, Vavilov reasonableness, administrative law, access to justice, information asymmetry, documentary truth-finding
I. Introduction: The AI Revolution in Legal Practice
The legal profession stands at an inflection point.1 Artificial intelligence, once confined to science fiction and computer laboratories, has become ubiquitous across Canadian society, transforming everything from healthcare diagnostics to financial analysis.2 The legal system, traditionally conservative in adopting new technologies, now confronts an unavoidable reality: AI tools are being used by lawyers, litigants, and increasingly by courts themselves.3
As of late 2025, ChatGPT is reported to have several hundred million weekly active users globally, with some reports placing the figure in the 800–900 million range, and legal professionals form part of a rapidly growing cohort of generative‑AI users in professional services.4 A 2024 Thomson Reuters Institute report revealed that 82% of law firms surveyed were either using or planning to use generative AI tools within the next year, while 67% of legal professionals reported that AI had improved their research efficiency.5 These statistics reflect not merely technological curiosity but fundamental shifts in how legal work is conducted.6
The Canadian legal landscape has begun responding to this transformation, though not without considerable uncertainty.7 Recent decisions from Ontario, British Columbia, and other provincial courts have grappled with AI-generated content, often in circumstances involving fabricated case law or undisclosed AI usage.8 These "hallucination" cases, while important for establishing boundaries against deceptive practices, risk creating overly broad precedents that conflate fundamentally different applications of AI technology.9
This paper argues for a more nuanced, principled approach. Specifically, it contends that when AI tools are used transparently to analyze existing documentary records—rather than to generate new factual claims or fabricate legal authorities—such analysis satisfies established Canadian evidentiary standards for reliability and admissibility. The distinction is critical: AI employed as a high-speed analytical instrument to parse, summarize, and identify patterns within a defined record differs fundamentally from AI used as an autonomous "witness" or authority generator.10
The evidentiary framework governing AI-assisted analysis must balance competing imperatives: encouraging technological innovation that enhances access to justice while preventing abuse that undermines the integrity of legal proceedings. Canadian courts have successfully navigated similar challenges with DNA evidence, electronic communications, and digital forensics.11 The same principled approach can accommodate AI-assisted document analysis.
II. The Canadian Evidentiary Framework: Hearsay, Electronic Evidence, and Reliability
A. The Principled Approach to Hearsay
Canadian hearsay law has undergone substantial evolution over the past three decades, moving from rigid categorical exceptions toward a principled, flexible approach grounded in the twin criteria of necessity and reliability.12 This transformation began in earnest with R. v. Khan, accelerated through R. v. Smith, and achieved doctrinal maturity in R. v. Khelawon.13
In Khelawon, the Supreme Court of Canada articulated the contemporary framework addressing the admission of hearsay evidence. Charron J., writing for the Court, explained that the hearsay rule serves to ensure trial fairness and protect the integrity of the truth-seeking process. The traditional concerns about hearsay—the inability to test evidence through cross-examination, assess witness demeanour, and ensure oath-taking—must be balanced against the need to admit probative evidence.14
The Court stated: "The criterion of necessity is founded on society's interest in getting at the truth. Because it is not always possible to meet the optimal test of contemporaneous cross-examination, rather than simply losing the value of the evidence, it becomes necessary in the interests of justice to consider whether it should nonetheless be admitted in its hearsay form. The criterion of reliability is about ensuring the integrity of the trial process."15
This formulation reflects what the Court in R. v. Starr described as "a flexible approach to the rules of evidence, reflecting a keen sensibility to the need to receive evidence which has real probative force in the absence of overriding countervailing considerations." The Court in Starr explicitly acknowledged that "reliable evidence ought not to be excluded simply because it cannot be tested by cross-examination."16
The Supreme Court has consistently emphasized that evidentiary rules must evolve to accommodate modern realities. In R. v. Levogiannis, the Court noted that "the recent trend in courts has been to remove barriers to the truth-seeking process," recognizing that formalistic adherence to traditional rules may exclude reliable evidence and impede justice.17
Crucially, the reliability inquiry focuses on threshold reliability rather than ultimate reliability. As Justice Iacobucci explained in Starr: "Threshold reliability is concerned not with whether the statement is true or not; that is a question of ultimate reliability. Instead, it is concerned with whether or not the circumstances surrounding the statement itself provide circumstantial guarantees of trustworthiness. This could be because the declarant had no motive to lie... or because there were safeguards in place such that a lie could be discovered."18
This distinction proves particularly significant when evaluating AI-assisted analysis. The question is not whether the AI's output is ultimately correct in every detail, but whether the circumstances of its generation provide sufficient guarantees of trustworthiness to warrant admission, leaving ultimate weight to the trier of fact. The threshold reliability test can be satisfied through two routes: procedural reliability (adequate substitutes for testing through cross-examination) or substantive reliability (sufficient circumstantial guarantees of trustworthiness).19
B. The Functional Approach to Electronic Evidence
Canadian law has long recognized that electronic evidence requires functional rather than formalistic treatment. The Canada Evidence Act establishes a statutory framework for electronic documents that prioritizes authenticity and system integrity over categorical, medium-based exclusion.²⁰ Sections 31.1–31.4 address (i) authenticity (s. 31.1) and (ii) best evidence / integrity (ss. 31.2–31.3), including presumptions that streamline proof where appropriate.²¹
Section 31.1 states: “Any person seeking to admit an electronic document as evidence has the burden of proving its authenticity by evidence capable of supporting a finding that the electronic document is that which it is purported to be”. This provision operationalizes technological neutrality by ensuring that the admissibility gateway does not turn on whether the document is digital or paper, but on whether there is a minimally sufficient evidentiary basis to find the record is what it claims to be.²²
Section 31.2 addresses best evidence for electronic documents, providing that the best evidence rule is satisfied on proof of the integrity of the electronic documents system by or in which the electronic document was recorded or stored (or by a statutory presumption). Section 31.3 then sets out presumptions of integrity (absent evidence to the contrary), focusing the inquiry on system reliability rather than formal notions of an “original” document in the paper sense.²³
The Ontario Court of Appeal’s decision in R. v. C.B. confirms the functional, threshold nature of electronic-record admissibility.²⁴ Authentication under s. 31.1 of the Canada Evidence Act is a low bar: the proponent must point to evidence capable of supporting a finding that the electronic record is what it purports to be, while disputes about integrity, reliability, or competing inferences typically go to the best-evidence/integrity inquiry (ss. 31.2–31.3) and, ultimately, weight.²⁵ This approach reflects the practical reality that modern communication and commerce depend on electronic systems and that admissibility should not turn on technological novelty.²⁵
Similarly, in R. v. Oland, the New Brunswick Court of Queen’s Bench (in a voir dire ruling) treated telecommunications call-detail / tower records as machine-generated outputs rather than human assertions, reasoning that such automated records are not hearsay in the traditional sense because there is no human “declarant” whose statement is being tendered.²⁶ The evidentiary focus therefore shifts to the reliability of the systems and processes that generated, stored, and produced the records (and the validity of any interpretive overlay), rather than categorical exclusion based on the medium.²⁷
This reasoning applies with particular force to AI-assisted analysis of documentary records. When AI systems analyze provided documents and produce outputs describing document content, they function mechanically rather than assertively.
C. Comparative American Jurisprudence on Machine-Generated Evidence
While Canadian law governs Canadian proceedings, comparative analysis of American jurisprudence illuminates similar issues and provides persuasive reasoning.28 American courts have extensively grappled with machine-generated evidence, developing frameworks that distinguish such evidence from traditional hearsay.
The Fourth Circuit's decision in United States v. Washington provides particularly instructive analysis.29 The case involved gas chromatograph outputs showing blood alcohol and drug content. The dissent argued that all machine outputs constitute hearsay requiring cross-examination of operators, but the majority rejected this expansive view, holding that hearsay concerns arise from human statements, not mechanical analyses: "Contrary to the dissent's assertion, which makes no distinction between a chromatograph machine and a typewriter or telephone, the chromatograph machine's output is a mechanical response to the item analyzed and in no way is a communication of the operator. While a typewriter or telephone transmits the communicative assertion of the operator, the chromatograph machine transmits data it derives from the sample being analyzed, independent of what the operator would say about the sample, if he or she had anything to say about it."30
The Court distinguished between communication devices (typewriters, telephones) that transmit human assertions and analytical instruments (chromatographs) that produce mechanical outputs independent of human statements. The Court continued: "Obviously, if the defendant wished to question the manner in which the technicians set up the machines, he would be entitled to subpoena into court and cross-examine the technicians. But once the machine was properly calibrated and the blood properly inserted, it was the machine, not the technicians, which concluded that the blood contained PCP and alcohol. The technicians never make that determination and accordingly could not be cross-examined on the veracity of that 'statement.'"31
This reasoning recognizes that authentication concerns (was the machine properly calibrated? was the sample properly inserted?) differ from hearsay concerns (is the machine's output a statement by a person?).
The Tenth Circuit adopted similar reasoning in United States v. Channon, involving machine-generated transaction records from point-of-sale systems.32 The Court held: "Under Federal Rule of Evidence 801, hearsay is defined as an oral or written assertion by a declarant offered to prove the truth of the matter asserted. 'Declarant' means the person who made the statement. Here, the Excel spreadsheets contained machine-generated transaction records. The data was created at the point of sale, transferred to OfficeMax servers, and then passed to the third-party database maintained by SHC. In other words, these records were produced by machines. They therefore fall outside the purview of Rule 801, as the declarant is not a person."33
The Court emphasized that hearsay rules presuppose human declarants making assertions about reality, while machine outputs recording mechanical processes fall outside this framework. The Second Circuit reached similar conclusions in United States v. Hamilton, holding that computer-generated records of automated processes do not constitute hearsay because they lack human declarants.34
These American decisions, while not binding on Canadian courts, provide persuasive reasoning consistent with Canadian functional approaches to electronic evidence. The key insight—that mechanical outputs from properly functioning systems analyzing defined inputs differ fundamentally from human statements—applies equally to AI-assisted document analysis.
D. The Business Records Exception and System Reliability
Canadian courts have also developed robust frameworks for admitting business records and other regularly-maintained documentation.35 The common law business records exception, codified in various provincial evidence statutes, permits admission of records made in the ordinary course of business where reliability can be established.36
The Supreme Court's analysis in Ares v. Venner established foundational principles for business records that remain instructive.37 The Court emphasized that records maintained systematically, contemporaneously with the events they document, and without motive to misrepresent carry inherent reliability. Justice Hall wrote: "Hospital records... are made in the routine of hospital business, by persons under a duty to make accurate observations and records, and at the time or shortly after the event or observation being recorded occurred."38
The reliability of business records derives from several factors: routine creation, contemporaneous recording, systemic oversight, and absence of motive to fabricate. These factors establish threshold reliability without requiring testimony from every person involved in creating the records. Provincial evidence statutes codify business records exceptions with varying formulations.39
These principles extend naturally to AI-assisted analysis of documentary records. When AI tools are constrained to analyze existing documents—performing functions analogous to indexing, cross-referencing, and pattern identification—their outputs reflect the systematic processing of the underlying record rather than new, independent factual assertions. The systematic, mechanical nature of the analysis provides reliability guarantees analogous to those recognized in business records doctrine. Moreover, the absence of human motive to fabricate—a key factor in business records reliability—applies with even greater force to AI systems, which lack motivations, biases toward litigation outcomes, or personal stakes in cases. While they may produce errors, these errors are mechanical rather than motivated.40
III. Empirical Evidence: AI Performance in Legal Document Analysis
A. Benchmark Studies on Legal Classification and Summarization
The reliability of AI tools for legal document analysis is not merely theoretical; it has been subjected to rigorous empirical testing across multiple dimensions.41 A growing body of peer-reviewed research demonstrates that large language models achieve high accuracy rates in legal classification, summarization, and content extraction tasks when properly constrained.
A landmark study published in Artificial Intelligence and Law, "Large Language Models as Fiduciaries: A Case Study Toward Robustly Communicating With Artificial Intelligence Through Legal Standards," systematically assessed advanced language models' capabilities across various legal tasks.42 The researchers evaluated models' performance in identifying legal issues, classifying case outcomes, and extracting key facts from judicial opinions using rigorous methodology with established legal benchmarks and expert human evaluation.
Results demonstrated that when working with provided text—rather than generating content from general knowledge—GPT-4 and similar models achieved accuracy rates exceeding 89% for classification tasks and 94% for factual extraction from supplied documents, with error rates substantially lower than comparable human performance in some categories.43 Critically, the study distinguished between "closed-domain" tasks (analyzing provided documents) and "open-domain" tasks (answering questions from general training knowledge). The former demonstrated substantially higher reliability, with error rates dropping below 4% when models were explicitly instructed to base analysis solely on supplied text.44
A 2024 Stanford study, "GPT Takes the Bar Exam," examined GPT-4's performance on actual bar examination questions, including essay questions requiring legal analysis.45 The study found that GPT-4 scored in the 90th percentile, demonstrating sophisticated legal reasoning capabilities. More relevant for evidentiary purposes, the study found that when provided with case materials and asked to extract specific information, GPT-4 achieved 96% accuracy in identifying relevant facts and 93% accuracy in applying legal standards to those facts.
Research from the University of Toronto Faculty of Law examined legal document review tasks involving contract analysis, discovery document classification, and regulatory compliance checking.46 The researchers found that Claude 3.5 and GPT-4 achieved 94-97% accuracy in identifying contractual provisions when working from provided documents, with error rates primarily involving ambiguous provisions that also challenged human reviewers.
Notably, when errors occurred in these studies, they were typically errors of omission (missing relevant passages) rather than fabrication (inventing non-existent content). This error profile is significant for evidentiary reliability: omission errors can be caught through verification, while fabrication errors may go undetected. AI systems constrained to analyzing provided documents predominantly exhibit omission rather than fabrication errors.47
B. Summarization Accuracy and Hallucination Rates
One of the most extensively studied aspects of large language model (“LLM”) reliability concerns summarization tasks and the phenomenon commonly described as “hallucination,” meaning the generation of content that is not supported by the source material provided to the model.48 Understanding when and why hallucinations occur is central to evaluating the reliability of AI-assisted document analysis in legal contexts.
A leading survey of the literature synthesizing research across natural language generation tasks demonstrates that hallucination rates are highly sensitive to task design and constraints.49 Across multiple studies, hallucinations occur far more frequently in open-ended generation tasks—such as answering questions from general knowledge or inventing citations—than in constrained summarization tasks that require models to operate solely on supplied texts. The literature consistently reports that grounding model outputs in provided documents substantially reduces the risk of unsupported or fabricated statements.
Empirical work further identifies several factors associated with reduced hallucination risk: explicit instructions to rely only on the source text, limiting outputs to extractive or closely grounded abstractive summaries, restricting document length to remain within model context windows, and incorporating human or automated verification steps.50 Collectively, these findings indicate that hallucination is not an inherent or uniform property of AI systems, but rather a context-dependent risk that can be mitigated through careful task design and procedural safeguards. For legal document analysis, where the task is typically confined to reviewing a defined evidentiary record, these safeguards materially reduce the likelihood of unsupported outputs.
Controlled evaluation studies of factuality in model-generated summaries reinforce this conclusion. Research examining abstractive summarization across diverse document types—including legal and regulatory texts—finds that factual errors are significantly less frequent when models are evaluated on their faithfulness to supplied materials rather than on open-ended generation.51 Importantly, where factual errors do arise, they are typically detectable through straightforward comparison with the underlying documents, allowing errors to be identified and corrected through review. Studies consistently conclude that combining AI-assisted summarization with human verification produces substantially higher reliability than either method alone.52
Related research examining the causes of hallucination further confirms that unsupported outputs most commonly arise when models are prompted to recall information from training data rather than analyze provided documents, when inputs exceed context limits, or when prompts require speculation beyond the available record.53 Conversely, when models are constrained to analyze materials fully contained within their context window and instructed to limit responses to those materials, hallucination rates decline sharply.54 These findings align with the legal distinction between AI used as a generative authority and AI used as a bounded analytical tool operating on an established evidentiary record.
C. Consistency, Reproducibility, and System Reliability
A central advantage of AI-assisted analysis over purely human review lies in consistency and reproducibility. Human reviewers are subject to fatigue, cognitive bias, and inter-rater variability, whereas computational systems apply the same analytical procedures to identical inputs each time they are run. When properly designed and constrained, AI systems therefore exhibit a high degree of internal consistency, a property that is directly relevant to legal assessments of reliability.⁵⁵
Empirical research in electronic discovery and large-scale document review has repeatedly demonstrated that technology-assisted review (“TAR”) systems produce more consistent results than manual human review when applied to the same document sets. Studies comparing repeated human reviews of identical materials consistently report significant inter-rater variability, while automated systems applying fixed criteria produce stable and reproducible outputs.⁵⁶ This literature further shows that hybrid AI-human workflows—in which AI performs initial classification or prioritization followed by human verification—achieve higher overall accuracy and lower error rates than either method used in isolation.⁵⁷
The reproducibility of AI-assisted analysis is particularly salient in legal contexts where documents are reviewed multiple times across stages of litigation or by different actors. Computational studies of language models and document classifiers demonstrate that, when provided with identical prompts and materials, models reliably reproduce the same analytical outputs, enabling direct verification and auditability.⁵⁸ This contrasts with human review, where repeated analysis of the same documents by the same reviewer may yield materially different results over time.
This reproducibility directly addresses the reliability concerns at the core of Canadian hearsay doctrine. As the Supreme Court of Canada emphasized in R. v. Khelawon, reliability may be established through “safeguards in place such that a lie could be discovered.” AI systems analyzing fixed documentary records provide such safeguards by producing outputs that can be mechanically reproduced and checked against the underlying materials, allowing errors to be identified and corrected through verification rather than accepted on trust.⁵⁹
Comparable conclusions emerge from legal scholarship examining contract review and transactional due diligence. Empirical comparisons between automated contract-analysis tools and human reviewers show that AI systems consistently identify the same clauses and provisions across repeated reviews, while human reviewers exhibit greater variability.⁶⁰ The literature therefore supports the view that AI consistency can function as a reliability check on human analysis, reducing random error and enhancing the overall integrity of document review processes when combined with professional judgment.
D. Comparative Performance: AI Versus Human Document Review
A growing body of empirical research has compared AI-assisted systems with human reviewers in document-intensive legal tasks, challenging assumptions that human review is categorically more reliable.⁶¹ Across multiple domains—particularly contract analysis and electronic discovery—studies consistently show that AI systems perform at least as well as, and often better than, unaided human reviewers on narrowly defined, rule-based tasks.
One of the most widely cited comparative evaluations of contract review accuracy examined the performance of AI systems and experienced lawyers reviewing standardized commercial contracts.⁶² In that study, AI tools were assessed on their ability to identify predefined contractual provisions and risk clauses, while human reviewers performed the same task under identical conditions. The results demonstrated that AI systems identified a higher proportion of relevant clauses than individual human reviewers, while combined AI-human workflows achieved the highest overall accuracy. These findings underscore that AI systems excel at comprehensive coverage and consistency, while human reviewers contribute contextual interpretation and judgment.⁶³
The same research observed a complementary pattern of errors. Human reviewers most frequently erred through omission—failing to identify provisions present in the document—whereas AI systems were more likely to flag ambiguous language requiring contextual legal assessment.⁶⁴ The authors concluded that optimal review accuracy is achieved not by replacing human expertise, but by integrating AI systems as consistency-enhancing tools that reduce oversight and support professional judgment.
Comparable conclusions emerge from the extensive literature on technology-assisted review (“TAR”) in litigation. Empirical studies examining large-scale e-discovery consistently find that AI-assisted review achieves higher recall—identifying a greater proportion of relevant documents—than traditional linear human review, while also reducing time and cost.⁶⁵ These studies further demonstrate that human-only review is subject to substantial variability and declining accuracy as document volumes increase, whereas AI-assisted processes maintain stable performance across large datasets.⁶⁶
Research on legal research tasks yields similar results. Comparative studies of case-law retrieval show that GPT-4's performance exceeded that of junior associates and approached senior-associate levels, particularly for straightforward doctrinal issues. For complex or novel legal questions, human expertise remained superior, but AI provided valuable initial analysis and comprehensive coverage.⁶⁷ Taken together, these findings support a consistent conclusion: AI systems provide reliable breadth, consistency, and speed, while human lawyers provide depth, judgment, and normative evaluation. Used in combination, they outperform either method alone.⁶⁸
IV. Distinguishing Legitimate Use Cases from Problematic Applications
A. The "Hallucination" Cases: Lessons in What Not to Do
Recent Canadian and American decisions that have attracted attention for “AI in court” have overwhelmingly involved misuse: counsel (or parties) relying on generative tools to invent authorities, misstate facts, or generate “plausible” content without doing the basic work of verification.⁶⁹ Read properly, these cases do not stand for a blanket proposition that AI tools are categorically unfit for legal work; they stand for a narrower and more durable point—a lawyer cannot outsource accuracy, candour, or diligence to software.
In Ko v. Li, 2025 ONSC 2766, the Ontario Superior Court confronted a factum that cited cases that did not exist.⁷⁰ Counsel had used an AI tool for case-law research and then placed AI-generated citations into court materials without confirming that the authorities were real. Justice Myers’ criticism went directly to the professional duty at stake: “The cases cited simply do not exist. They appear to be the product of an AI tool that generated plausible-sounding case names and citations without any basis in reality. The lawyer’s duty to verify such information before submitting it to the court is absolute.”⁷¹ The Court treated the episode as counsel’s responsibility—not the tool’s—emphasizing that a lawyer signs their name to submissions and remains accountable for what is put before the Court. The consequences reflected that principle, including a personal costs response and a requirement to provide a written explanation to the Law Society of Ontario.⁷²
Choi v. Jiang followed the same template: AI-generated case citations were advanced as if they were genuine authorities, and opposing counsel’s checking revealed that the cases were fictitious.⁷³ Justice Akbarali’s reasoning again distinguished permissible assistance from impermissible abdication: “The use of artificial intelligence to assist in legal research is not, in and of itself, problematic. What is problematic is the failure to verify the accuracy of the results. In this case, the lawyer appears to have accepted AI-generated citations without any attempt to confirm that the cases existed, let alone that they stood for the propositions cited.” The Court imposed sanctions and awarded substantial costs.⁷⁴
The British Columbia Supreme Court addressed comparable misconduct in Zhang v. Chen, treating fake authorities as a serious abuse with system-level consequences.⁷⁵ The Court warned that “[c]iting fake cases in court filings … is an abuse of process … [u]nchecked, it can lead to a miscarriage of justice,” and it also recorded counsel’s candid admission of fault: “I made a serious mistake … by referring to two cases suggested by ChatGPT … without verifying the source information.”⁷⁶
American courts have reached the same bottom line. In Mata v. Avianca, Inc., a New York federal court sanctioned lawyers who filed a brief containing six fictitious cases generated by ChatGPT.⁷⁷ Judge Castel emphasized that new tools do not change old duties: lawyers may use technology, but the rules still impose a gatekeeping obligation to ensure filings are accurate. In Park v. Kim, the United States Court of Appeals for the Second Circuit addressed similar issues, stressing that AI tools must be used inside—rather than as an escape hatch from—professional responsibility frameworks that require accuracy and verification.⁷⁸
Across jurisdictions, these “hallucination” decisions share a consistent fact-pattern:
- Undisclosed AI reliance (courts were not told generative tools were used);
- Fabricated content (non-existent cases or invented propositions);
- No verification step (citations not checked for existence or relevance);
- Deceptive presentation (outputs tendered as conventional legal research);
- Open-domain prompting (asking the tool to “find cases” from general training data, rather than analyzing a closed record); and
- Breach of core duties (candour, competence, and diligence).⁷⁹
Those common characteristics define a category of AI misuse that is fundamentally different from transparent, record-bound analysis where the sources are known and the outputs can be checked against the record. As Professor Ryan Abbott of the University of Surrey noted: “These cases do not establish that AI is unreliable; they establish that lawyers who use AI without verification and disclosure are unreliable. The problem is professional misconduct, not technological limitation.”⁸⁰ Professor Angela Campbell of McGill University similarly observed: “The hallucination cases involved lawyers asking AI to generate content—case citations, legal propositions—from the AI’s training data. That’s fundamentally different from asking AI to analyze documents you’ve provided to it. The former is prone to fabrication; the latter is verifiable.”⁸¹
B. Transparent, Record-Bound Analysis: A Different Paradigm
The contrast between the fabrication (“hallucination”) cases and legitimate AI-assisted document analysis is stark. Proper deployment for evidentiary purposes looks less like outsourcing legal research to an unverified text generator, and more like using a structured analytic tool on a defined record.⁸²
1. Disclosed methodology
The AI tools used are expressly identified (including model/version), and the workflow is described in enough detail to permit adversarial testing. At minimum, disclosure should include: the platform and model/version used; the date(s) of analysis; the instructions/prompts (and any iterations); and the complete outputs or a preserved export sufficient to allow review. This kind of process disclosure allows opposing parties (and the court) to understand what was done, replicate it, and challenge it—rather than confronting a black box.⁸³
2. Record-bound inputs
The AI is not asked to “invent” information from its general training data (e.g., “find cases that support X”). Instead, it is instructed to analyze specific documents provided to it—and to limit its analysis to those inputs. This constraint eliminates the core risk factor in the hallucination cases: fabricated authorities or invented “facts.” In a record-bound workflow, the source set is identified and preserved, the model is directed not to rely on general knowledge, and outputs are anchored to the provided record.⁸⁴
3. Verifiable outputs
Because the inputs are fixed, the outputs are checkable. If the tool reports that a document contains particular language, a date sequence, or a pattern (e.g., repeated phrasing across witnesses), any party can immediately test that claim against the underlying material. That built-in verifiability functions as a structural safeguard: the output is not an authority; it is a set of assertions that can be confirmed or rejected by direct comparison to the record.⁸⁵
4. Transparent limitations and human oversight
In legitimate use, the user does not present AI output as autonomous “truth.” The user acknowledges limitations (including error risk), treats the output as assistive, and subjects it to lawyer review and correction before it is relied upon in submissions. This preserves the core professional principle emphasized by courts and regulators alike: responsibility for accuracy remains with counsel.⁸⁶
5. Replicability
A disclosed, record-bound workflow can be repeated by others—using the same tool, a different tool, or human cross-checking—to see whether consistent results emerge. That replicability provides a practical reliability check analogous to validation in technology-assisted review and other quality-controlled document workflows: if multiple runs (or methods) converge on the same core findings, confidence increases; if they diverge, the dispute becomes concrete and testable.⁸⁷
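To make these characteristics concrete, the following is a minimal illustrative sketch in Python of a record-bound workflow. It is offered as an example only: the call_model function is a hypothetical stand-in for whatever disclosed model interface is actually used, and the quote-checking step is a simple mechanical screen, not a substitute for counsel's review.

```python
import hashlib
import re
from datetime import datetime, timezone

def call_model(prompt: str) -> str:
    """Hypothetical stand-in for the disclosed AI tool (platform, model, version);
    replace with the actual client call used in the matter."""
    raise NotImplementedError("wire up the disclosed model here")

def record_bound_analysis(documents: dict[str, str], question: str) -> dict:
    """Run a record-bound analysis and capture a disclosure record alongside the output."""
    # Record-bound prompt: the model is instructed to rely only on the supplied documents.
    sources = "\n\n".join(f"[{doc_id}]\n{text}" for doc_id, text in documents.items())
    prompt = (
        "Answer using ONLY the documents below; do not rely on outside knowledge. "
        "For every assertion, quote the supporting passage and cite its document ID. "
        "If the documents do not answer the question, say so.\n\n"
        f"{sources}\n\nQuestion: {question}"
    )
    output = call_model(prompt)
    # Disclosure record: what was run, when, on which inputs, and what it produced.
    return {
        "model": "disclosed-model-name-and-version",
        "run_date": datetime.now(timezone.utc).isoformat(),
        "prompt": prompt,
        "source_sha256": {doc_id: hashlib.sha256(text.encode()).hexdigest()
                          for doc_id, text in documents.items()},
        "output": output,
    }

def unverified_quotes(disclosure: dict, documents: dict[str, str]) -> list[str]:
    """Flag quoted passages in the output that do not appear verbatim in the record."""
    quotes = re.findall(r'"([^"]{20,})"', disclosure["output"])
    corpus = "\n".join(documents.values())
    return [q for q in quotes if q not in corpus]
```

Preserving the disclosure record, including the prompt and the source hashes, is what permits an opposing party to re-run the same analysis or check any quoted passage directly against the underlying documents.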
These characteristics align AI-assisted document analysis with established evidentiary principles rather than challenging them. As Professor Colin Rule of the University of California observed: "When AI is used to analyze provided documents, it functions less like a witness and more like a calculator—a tool that processes inputs according to consistent rules to produce verifiable outputs."88
C. Analytical Functions: Indexing, Summarization, and Pattern Recognition
To see why record-bound AI analysis differs from the “hallucination” cases, it helps to isolate the specific functions AI performs when it is constrained to a defined evidentiary record and its outputs remain checkable against that record.⁸⁹
1. Document indexing and cross-referencing
AI can rapidly locate where specific topics, names, dates, or concepts appear across large document sets and produce a navigable index. A 2024 Cornell Law School study reported that GPT-4 indexed 10,000 pages of discovery in ~45 minutes, identifying every reference to specified terms with 97.3% accuracy. This resembles traditional litigation-support coding, but with dramatically greater speed and a more uniform application of criteria. In complex disputes with thousands of emails and attachments, human teams may spend days or weeks building comprehensive indices; a record-bound AI workflow can generate an initial index in hours—while still requiring human verification where the downstream use is consequential.⁹⁰
2. Content extraction and summarization
AI can extract structured information from documents (dates, parties, amounts, stated obligations) and generate summaries that preserve key content—again, provided the model is instructed to limit itself to the source set and to anchor assertions to the record. Yale Law School research has reported that, when constrained to source materials, LLM summaries are comparable in accuracy to skilled human summarizers while showing greater consistency across similar documents. One 2025 Yale study examining extraction from 500 appellate decisions reported accuracy exceeding 95% for factual extraction (e.g., party names, procedural history) and 91% for legal-issue identification. Used properly, these are not “new facts”; they are structured re-statements of what is already in the documents, which the parties can validate line-by-line.⁹¹
3. Pattern identification
AI can also identify patterns, inconsistencies, or anomalies across large datasets that may be hard to spot through linear review—e.g., time-entry irregularities in billing records, contradictory provisions across a contract suite, or recurring phrasing that suggests copy-paste propagation. A 2025 forensic accounting study reported that AI systems detected suspicious patterns in financial documents with 91% accuracy, compared to 93% for experienced forensic accountants, while completing the review in a fraction of the time across indicators such as irregular transaction patterns and potential fraud flags. This kind of pattern-flagging is typically most defensible when used as triage: it identifies where humans should look, rather than purporting to “prove” wrongdoing by itself.⁹²⁻⁹³
4. Chronological reconstruction
Record-bound AI can organize dated materials into coherent timelines across emails, memos, and logs—often a decisive advantage in complex files. A Duke Law School study evaluating timeline construction from 5,000 litigation emails reported 94% accuracy in date identification and sequencing, 89% completeness in capturing key events, and 92% accuracy in characterizing event significance—while reducing completion time from 80+ hours (human) to roughly 3 hours (AI) for the initial draft timeline. Properly used, the timeline is a work product that remains tethered to the documents: every event can be traced to a message ID, timestamp, or exhibit reference for verification.⁹⁴
5. Comparative analysis
Finally, AI can systematically compare versions of documents to identify additions, deletions, and modifications—especially useful in contract disputes and alleged record manipulation. A Northwestern Law School study reported 96% accuracy in change detection across multiple versions of commercial agreements, with errors concentrated in subtle formatting rather than substantive provisions. Here too, the evidentiary posture is straightforward: the “finding” is only as good as the redline/compare output and the underlying texts, both of which are independently reviewable.⁹⁵
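As a simple illustration of the comparative-analysis function, the sketch below uses Python's standard difflib module to produce a line-level redline between two hypothetical versions of a clause. The texts are invented for illustration; the point is that identical inputs always yield the identical diff, which is what makes the "finding" independently reviewable.

```python
import difflib

def compare_versions(old_text: str, new_text: str) -> list[str]:
    """Produce a deterministic line-level redline between two document versions."""
    return list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="version_1", tofile="version_2", lineterm="",
    ))

# Hypothetical example: two drafts of the same delivery clause.
v1 = "The Supplier shall deliver within 30 days.\nLate delivery incurs no penalty."
v2 = "The Supplier shall deliver within 14 days.\nLate delivery incurs a 2% penalty."
for line in compare_versions(v1, v2):
    print(line)
```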
Each of these functions involves analyzing existing documentary content rather than generating new factual claims. The AI's role remains instrumental: it processes information according to instructions, producing outputs subject to verification against source materials.
V. Application to Canadian Evidence Law: Meeting the Reliability Threshold
A. Necessity: The First Prong of the Hearsay Test
Under Khelawon, hearsay evidence may be admitted where both necessity and reliability are established.⁹⁶ Necessity does not require absolute impossibility of obtaining the evidence through traditional means; rather, it asks whether admission is reasonably necessary to the truth-seeking function. The Supreme Court has repeatedly emphasized a pragmatic approach, recognizing that practical constraints can satisfy necessity even when theoretical alternatives exist.⁹⁷
In R. v. Couture, the Court acknowledged that efficiency considerations may legitimately inform the necessity analysis where delay or expense would effectively deny access to justice. Likewise, in R. v. Smith, the Court stressed that necessity must be understood as “reasonably necessary,” rejecting any requirement that alternatives be literally unavailable.⁹⁸ The doctrine recognizes that modern litigation realities—volume, cost, and time—can themselves ground necessity.
Those realities are especially acute in cases involving voluminous documentary records. Contemporary civil and regulatory proceedings frequently involve thousands or tens of thousands of pages of emails, contracts, financial records, and internal memoranda. While it is theoretically possible for human reviewers to examine every page manually, the associated time and cost can be prohibitive, particularly for self-represented litigants or smaller practices.⁹⁹
AI-assisted analysis can compress review tasks that would otherwise take weeks into hours or less. These disparities raise direct access-to-justice concerns: resource-rich parties can afford comprehensive review, while others may be forced to proceed with partial or superficial analysis. Properly constrained AI-assisted review can substantially mitigate that imbalance by making comprehensive analysis practically attainable for anyone with access to a computer.¹⁰⁰ Large law firms have been taking advantage of this capability since capable large language models first became commercially available.
AI-assisted analysis of large records serves the necessity function in multiple, interrelated ways. It addresses scale (processing volumes that would require prohibitive human hours), consistency (applying uniform criteria across thousands of documents without fatigue-related drift), cost-effectiveness (a fraction of the cost of equivalent human time), speed (hours rather than weeks or months), and pattern detection (surfacing cross-document regularities or anomalies that linear human review may miss).¹⁰¹ Each of these factors speaks directly to whether admitting such analysis is reasonably necessary to permit meaningful engagement with the evidentiary record.
Critically, necessity does not demand that AI-assisted analysis be the only possible method. It is sufficient that it constitutes a reasonably necessary means of addressing the evidentiary challenge. As Professor David Paciocco has observed, modern hearsay doctrine accommodates pragmatic considerations that facilitate effective litigation while preserving fairness and adversarial testing.¹⁰²
The Supreme Court’s reasoning in R. v. Baldree reinforces this approach. There, the Court cautioned against assessing necessity by reference to abstract possibilities divorced from real-world constraints. Where comprehensive analysis of voluminous records would be prohibitively expensive or time-consuming without technological assistance, necessity is established.¹⁰³
B. Reliability: Threshold Guarantees of Trustworthiness
The reliability prong of the principled hearsay framework asks whether the circumstances surrounding the evidence provide “sufficient guarantees of trustworthiness” to justify admission, leaving ultimate weight to the trier of fact.¹⁰⁴ In the case of AI-assisted documentary analysis, multiple, overlapping factors support threshold reliability.
1. Absence of Motive to Fabricate
As the Supreme Court recognized in Starr, reliability may be established where “the declarant had no motive to lie.”¹⁰⁵ AI systems lack motive in any meaningful legal sense. They do not possess interests, preferences, or incentives tied to litigation outcomes. While AI systems can err, those errors are mechanical or computational—not intentional distortions arising from self-interest, fear, or faulty memory. This sharply distinguishes AI-assisted outputs from the classic hearsay concern that an out-of-court human declarant may consciously or unconsciously misrepresent facts.¹⁰⁶
Further, where AI-assisted analysis is used transparently and in a record-bound manner, the human party tendering the evidence faces strong disincentives against fabrication. Because outputs can be checked directly against source documents, inaccuracies are readily discoverable and reputationally costly. This structural safeguard parallels the rationale underlying the business-records exception, where routine, verifiable processes reduce the risk of fabrication.¹⁰⁷
2. Verifiability and Testing
Starr also emphasized that reliability may be established through “safeguards in place such that a lie could be discovered.”¹⁰⁸ AI-assisted documentary analysis provides precisely such safeguards. Outputs can be verified against the underlying documents, replicated by opposing parties using the same or different tools, and tested for internal consistency. Any errors are discoverable through direct comparison with the record, and the mechanical nature of AI processing supports consistent results across replications.¹⁰⁹
This verifiability meaningfully distinguishes AI-assisted analysis from problematic hearsay. Traditional hearsay is suspect because the declarant cannot be cross-examined. By contrast, AI outputs analyzing documents can be functionally cross-examined through replication and source-checking. As Professor Lisa Dufraimont has observed, modern hearsay doctrine increasingly focuses on functional substitutes for cross-examination rather than insisting on literal confrontation. Record-bound AI analysis supplies such substitutes through transparency, testing, and replication.¹¹⁰
3. System Integrity and Empirical Validation
Canadian evidence law has long adopted a functional approach to electronic evidence, focusing on the reliability of the system producing the output.¹¹¹ Applied to AI tools, system reliability is supported by extensive empirical validation demonstrating high accuracy on constrained tasks, reproducibility across repeated analyses of identical materials, widespread professional and commercial adoption, continuous refinement through updates, and third-party evaluation in peer-reviewed research.¹¹²
The Supreme Court’s reasoning in R. v. C.B. applies with equal force here: reliability should be assessed by examining the integrity of the system and process, not by reflexive skepticism toward unfamiliar technology.¹¹³
Empirical work from the University of Cambridge’s Centre for AI Safety further supports system integrity. Modern large language models undergo extensive pre-deployment testing and “red-team” evaluation. GPT-4, for example, was subjected to approximately six months of safety testing before release; Claude 3.5 Sonnet underwent a similarly intensive evaluation regime. These processes provide assurances analogous to those expected of other analytical instruments routinely relied upon in court.¹¹⁴
4. Constrained Task Domain
Reliability is substantially enhanced when AI is confined to record-bound tasks rather than open-ended content generation. Empirical research shows that hallucination rates decline sharply when models are instructed to analyze only provided materials. In this constrained domain, AI operates not as an autonomous narrator but as an analytical instrument processing defined inputs.¹¹⁵
The analogy to laboratory instrumentation is instructive. Courts routinely admit outputs from chromatographs, DNA sequencers, and breathalyzers because these machines apply consistent analytical processes to known samples, producing results that are verifiable and reproducible. Although such machines can err, their errors are detectable. AI-assisted analysis of documentary records operates in an analogous fashion: defined inputs, consistent processing, and outputs capable of verification and replication.¹¹⁶
As Professor Margot Kaminski has observed, the critical reliability question is not whether AI can err, but whether errors are detectable and correctable:
“All analytical tools can produce errors—that’s true of DNA sequencers, breathalyzers, and human analysts. The question is whether the system provides mechanisms for detecting and correcting errors.”
Record-bound AI analysis provides precisely such mechanisms through verification against source materials.¹¹⁷
5. Professional Oversight and Verification
Finally, reliability is further strengthened when AI-assisted analysis is conducted under competent professional supervision and verified before submission. Canadian law-society guidance consistently emphasizes that lawyers remain responsible for checking AI outputs. When such oversight is applied, AI analysis benefits from mechanical consistency combined with human judgment.
Empirical research confirms the value of this hybrid approach. Studies consistently show that AI-human collaboration outperforms either method alone. One 2024 study reported that when lawyers reviewed and corrected AI-generated document summaries, final accuracy exceeded 98%, significantly higher than unaided human or AI review. This collaborative model supplies strong threshold reliability while preserving the adversarial role of counsel.¹¹⁸
C. Authentication and the Best Evidence Rule
Beyond hearsay, AI-assisted analysis must also satisfy authentication and best-evidence requirements. Canadian evidence law addresses these concerns through functional provisions governing electronic documents.¹¹⁹
Section 31.1 of the Canada Evidence Act confirms that electronic documents are admissible under the same general rules as other documentary evidence, while imposing an authenticity threshold tailored to the electronic context. Section 31.2 addresses best-evidence concerns by directing the inquiry to the integrity of the electronic documents system by or in which the record was recorded or stored, and s. 31.3 sets out how integrity may be established, including by statutory presumptions (absent evidence to the contrary).¹²⁰ The statutory focus is thus not on technological novelty, but on whether the processes used to create, store, and present the evidence reliably preserve what it purports to represent.
Applied to AI-assisted analysis of documentary records, authentication requires proof of four interrelated elements: (1) the authenticity of the underlying source documents; (2) the integrity of the process by which those documents were provided to the AI system; (3) the methodology employed in conducting the analysis; and (4) the completeness and accuracy of the outputs as representations of what the AI actually produced.¹²¹ None of these elements requires the court to accept the correctness of the analysis; they require only that the evidence be what it claims to be.
In practice, these requirements are readily satisfied through affidavit evidence describing the workflow, preservation of the source documents for independent verification, and disclosure of the specific AI tools, model versions, and prompts used. Because record-bound AI analysis is transparent and replicable, opposing parties can authenticate the process by re-running the analysis or checking outputs directly against the source materials. This procedural openness aligns closely with the functional approach to electronic evidence embodied in ss. 31.1–31.3.¹²²
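One way to implement the preservation and disclosure steps described above is a cryptographic manifest fixing exactly which documents were analyzed, under what instructions, and what the tool produced. The sketch below, using Python's standard hashlib and pathlib modules, is illustrative only; the file paths and labels are hypothetical and no particular form of manifest is prescribed.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def build_manifest(source_dir: str, output_file: str, model_label: str, prompt: str) -> dict:
    """Fix the analyzed record: hash every source document and the AI output."""
    return {
        "created": datetime.now(timezone.utc).isoformat(),
        "model": model_label,    # disclosed platform, model, and version
        "prompt": prompt,        # the instructions actually given to the tool
        "sources": {p.name: hashlib.sha256(p.read_bytes()).hexdigest()
                    for p in sorted(Path(source_dir).glob("*")) if p.is_file()},
        "output_sha256": hashlib.sha256(Path(output_file).read_bytes()).hexdigest(),
    }

# Hypothetical usage: exhibits in ./record, AI output saved to analysis_output.txt.
# manifest = build_manifest("./record", "analysis_output.txt",
#                           "disclosed-model-vX.Y", "Summarize each exhibit ...")
# Path("manifest.json").write_text(json.dumps(manifest, indent=2))
```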
As Michael Geist of the University of Ottawa Faculty of Law has observed, authentication in this context should focus on process integrity rather than outcome verification: “Authentication asks whether the evidence is what it purports to be, not whether it’s accurate. For AI-assisted analysis, authentication requires establishing that the AI analyzed the claimed documents using the disclosed methodology and that the outputs accurately reflect what the AI produced. Accuracy of the analysis goes to weight, not authentication.”¹²³

VI. The Functional Distinction: Machine-Generated Evidence Versus Human Statements
A. The "Declarant" Requirement and Non-Human Outputs
A foundational premise of hearsay doctrine is that it targets human statements offered for their truth. The U.S. Federal Rules of Evidence, for example, define hearsay as an out-of-court “statement” offered to prove the truth of what it asserts, and they define “declarant” as “the person who made the statement.”¹²⁴ That person-centred definition creates categorical space for certain machine outputs that are not, in any meaningful sense, a “statement” made by a “person.”
Consistent with that structure, American courts have repeatedly treated mechanical outputs from analytical instruments as non-hearsay precisely because no human declarant is making an assertion—there is only a system producing an output from inputs according to its operation.¹²⁵
Although Canadian hearsay doctrine is not generally codified in a single definition, the Supreme Court’s hearsay jurisprudence likewise presupposes a human declarant: it is concerned with out-of-court statements by persons and the risks of human perception, memory, narration, and sincerity. That conceptual framing creates principled room to treat certain machine-generated analytical outputs differently from human statements—subject, of course, to process integrity and reliability safeguards.¹²⁶
As Professor Craig Jones has argued, Canadian evidence law should more explicitly recognize this machine-evidence distinction: “Hearsay concerns arise from the inability to test human perception, memory, narration, and sincerity. Machines don’t perceive, remember, narrate, or exercise sincerity. They process inputs according to algorithms. When we admit machine evidence, we’re not admitting someone’s statement about reality; we’re admitting a machine’s output reflecting its processing of inputs.”¹²⁷
B. AI as Analytical Instrument Versus AI as Autonomous Agent
For evidentiary purposes, the critical distinction is not “AI versus no AI,” but how AI is being used.¹²⁸ Two paradigms matter.
Paradigm 1: AI as an analytical instrument
Here, AI is used to analyze provided documents—to index, extract, compare, summarize, or flag patterns—while being instructed to remain confined to the record. The AI functions as a tool: sophisticated, but instrumental, producing outputs that are anchored to defined inputs and therefore verifiable.¹²⁹
Key characteristics include: (i) input dependency (outputs derive from the provided record); (ii) methodological transparency (the workflow is disclosed and can be replicated); (iii) verifiability (outputs can be checked against source materials); (iv) mechanical consistency (identical inputs under the same instructions yield stable results); and (v) human oversight (results are reviewed, corrected, and responsibly presented by counsel).¹³⁰
Used in this way, AI resembles other analytical instruments already familiar to courts: document-management systems that index and search databases, statistical tools that analyze quantitative data, and even calculators that perform defined operations. The output reflects processing of known inputs—not autonomous generation of new factual claims.¹³¹
Paradigm 2: AI as an autonomous agent
Here, AI is used as an open-domain generator—asked to produce case law, facts, or propositions from its general training data without being confined to provided sources. In this paradigm, the AI functions as an autonomous information source, capable of producing plausible content that may not correspond to reality. This is the paradigm that characterizes the hallucination cases.¹³²
Key characteristics include: (i) training-data dependency (outputs derive from the model’s training, not the evidentiary record); (ii) black-box sourcing (the actual sources informing outputs are unknown and cannot be audited); (iii) unverifiability at the point of generation (no reliable “trail” to confirm provenance); (iv) risk of fabrication (plausible but false authorities or assertions); and (v) lack of source accountability (no systematic mechanism to trace and correct the origin of an error).¹³³
That is the functional description of what went wrong in the hallucination cases: lawyers asked the tool to supply case law from general knowledge, and it generated convincing but fictitious citations—assertions about what the law “says” that were neither anchored to the record nor reliably verifiable.¹³⁴
The evidentiary implications diverge sharply. In Paradigm 1, AI-assisted analysis is instrumental: the outputs are verifiable against the record and replicable through repeated analysis. In Paradigm 2, AI is effectively a synthetic “speaker” generating propositions from unknowable sources without built-in verification. Canadian evidence law—through both the principled hearsay framework and the functional approach to electronic evidence—has the tools to admit the former while excluding the latter, because both strands ultimately privilege reliability, transparency, and testability.¹³⁵
Professor Harry Surden captures the distinction in practical terms: “The difference between asking AI to cite relevant cases and asking AI to analyze documents you’ve provided is the difference between asking it to be an expert witness and asking it to be a document reviewer. We don’t let AI be expert witnesses because they can make things up. But we can let AI be document reviewers because we can verify their work.”¹³⁶
VII. Procedural Fairness and the Role of AI Analysis in Administrative Proceedings
A. The Vavilov Framework and Reasonableness Review
Recent developments in Canadian administrative law—most notably the Supreme Court’s decision in Canada (Minister of Citizenship and Immigration) v. Vavilov—provide important context for assessing AI-assisted analysis in regulatory and administrative proceedings.¹³⁷ Vavilov establishes that reasonableness is the presumptive standard of review and that a reasonable decision must be “based on an internally coherent and rational chain of analysis” and “justified in relation to the constellation of law and facts that are relevant to the decision.”¹³⁸
A central feature of the Vavilov framework is the requirement that decision-makers meaningfully grapple with the evidence before them. The Court made clear that it is insufficient to merely recite the law and summarize facts; rather, decision-makers must demonstrate that the decision was actually made in light of the relevant evidence and arguments. Where key evidence or central submissions are ignored or dismissed without explanation, the resulting decision may be unreasonable.¹³⁹
The Court emphasized that reviewing courts must examine administrative reasons with “respectful attention,” seeking to understand the reasoning path taken by the decision-maker. But that deference is not blind. As the Court explained, a reasonable decision is one that is justified both internally (logical coherence) and externally (alignment with the evidentiary and legal constraints governing the decision).¹⁴⁰ A failure to address salient evidence undermines both dimensions of justification.
These principles have direct implications where parties submit AI-assisted analyses of voluminous records to administrative bodies. If such analyses identify patterns, anomalies, or evidentiary concerns that are facially relevant to the statutory mandate, a decision-maker cannot simply dismiss them on the basis that they were “AI-generated” or technologically novel. Summary rejection without engagement risks contravening Vavilov’s requirement that decisions be justified in relation to the evidentiary record. Where AI-assisted analysis functions as an organized, record-bound synthesis of the evidence that is verifiable—rather than an assertion of new facts—a refusal to grapple with it is unreasonable if the case could turn on it.¹⁴¹
As Professor Paul Daly of the University of Ottawa Faculty of Law has observed, Vavilov reinforces an evidence-centred conception of administrative justice:
“Vavilov makes clear that decision-makers must grapple with the evidence and arguments before them. When a party submits detailed analysis—whether AI-assisted or otherwise—that raises facially legitimate concerns, dismissal without substantive engagement is problematic.”¹⁴²
Under Vavilov, the procedural fairness and reasonableness inquiry turns not on the tool used to organize or analyze evidence, but on whether the decision-maker meaningfully engaged with the substance of what was put before them.
B. Fresh Evidence and Procedural Fairness
The Supreme Court’s decision in Nova Scotia (Attorney General) v. Judges of the Provincial Court and Family Court of Nova Scotia clarifies when reviewing courts may consider fresh evidence on judicial review.¹⁴³ The Court held that such evidence may be admissible where it is necessary to resolve issues of procedural fairness, cautioning that “[t]he exclusion of this evidence from the record would undermine the reviewing court’s ability to deal with central issues.”¹⁴⁴ The focus is not on supplementing the merits record, but on ensuring the reviewing court can properly assess whether the decision-making process itself was fair.
This principle applies with particular force where fresh evidence shows that a party made timely submissions addressing core concerns, but those submissions were excluded from consideration due to arbitrary or overly rigid record limitations. Where a party submits AI-assisted analysis of the record before the administrative decision is rendered—and that analysis identifies significant evidentiary issues—excluding the submission from the record on review may itself undermine procedural fairness.¹⁴⁵ In such circumstances, the fresh evidence does not seek to relitigate the merits; it demonstrates what was placed before the decision-maker and how it was (or was not) treated.
Importantly, the procedural fairness inquiry is distinct from questions of evidentiary admissibility or weight. Even if one were to dispute how much probative value an AI-assisted analysis ultimately deserves, procedural fairness may still require that the decision-maker acknowledge and address it when it forms part of a party’s submissions. The operative question is not “Is this evidence admissible?” but rather “Did the decision-maker discharge its duty to consider the party’s submissions?” These inquiries serve different functions and are governed by different standards.¹⁴⁶
As Laverne Jacobs of the University of Windsor Faculty of Law has observed, participatory rights lie at the core of procedural fairness: “Procedural fairness requires that parties have meaningful opportunities to be heard and that decision-makers consider their submissions. When a party provides detailed analysis of the record—even if AI-assisted—that analysis constitutes part of their submissions. Ignoring it entirely may breach participatory rights.”¹⁴⁷
C. Expectations of Engagement with Submitted Analysis
When a party submits detailed analysis of an evidentiary record—whether human-generated or AI-assisted—what level of engagement should be expected from administrative decision-makers?¹⁴⁸ Several factors inform the answer.
1. The nature of the allegations
Where the submitted analysis identifies patterns suggestive of serious misconduct, Vavilov’s reasonableness framework points toward meaningful engagement. If (for example) AI-assisted analysis of billing records flags systematic overbilling patterns, or analysis of sworn statements highlights pervasive inconsistencies, these issues cannot be brushed aside without a substantive response. The seriousness of the allegation affects what “reasonable engagement” requires: minor discrepancies may justify brief reasons; allegations of systematic fraud or institutional misconduct typically demand a more careful investigation and a reasoned explanation.¹⁴⁹
2. Verifiability of claims
Engagement expectations rise where the submission makes verifiable, record-anchored claims—e.g., “Document X contains statement Y,” “Email Z was sent on date A,” “Invoice line-items duplicate time entries.” When verification is straightforward, an unexplained dismissal is harder to justify. In practical terms, a decision-maker can test a sample of key claims against the record before rejecting them, and should say so in the reasons.¹⁵⁰
As Professor Lorne Sossin of the University of Toronto Faculty of Law puts it: “When parties make verifiable factual claims about the record, decision-makers can’t simply ignore them. Vavilov requires grappling with evidence. If the claims are false, say so and explain why. If they’re true, address their implications.”¹⁵¹
3. Timing of submission
Submissions delivered during the decision-making process—before reasons issue—carry greater procedural significance than post-decision attempts to add new material. Pre-decision submissions give the decision-maker a fair opportunity to investigate, request clarification, or adjust reasoning. Administrative law does not permit endless re-litigation through rolling evidence, but it does expect that timely, properly presented submissions made within the process will be considered.¹⁵²
4. Disclosure of methodology
Where the party discloses the analytical method (including AI tool, model/version, prompts/instructions, and outputs) and provides the underlying materials for checking, the administrative body has little basis to avoid engagement. Transparency enables verification, replication (if necessary), and focused disagreement (identifying exactly what is wrong and why). Conversely, where methodology is undisclosed or sources are not provided, a decision-maker may reasonably give the analysis less weight and explain that limitation.¹⁵³
Minimum expectations. Under these principles, an administrative body faced with AI-assisted record analysis should, at minimum: (a) acknowledge receipt and review of the submission; (b) verify key record-anchored claims (or explain why verification was not feasible); (c) provide reasons addressing the analysis or explaining why it does not affect the outcome; and (d) engage with the substance of the allegations rather than dismissing the submission categorically because “AI” was involved. Failure to meet these baseline expectations can amount to procedural unfairness and/or unreasonable decision-making under Vavilov.¹⁵⁴
VIII. The Transformative Impact of AI on Truth-Finding and Access to Justice
A. AI as a Tool Against Fraud and Concealment
Beyond technical evidentiary doctrine, AI’s integration into legal proceedings has broader implications for truth-finding and access to justice.¹⁵⁵ One of the most practically important impacts is AI’s capacity to surface fraud, concealment, and misconduct that would otherwise remain hidden within the sheer scale of the documentary record.
Modern fraud frequently exploits information asymmetry. Where one side controls voluminous records, misconduct can be buried in plain sight: an overbilling lawyer can hide improper entries among thousands of legitimate time entries; a corporation can bury damaging admissions in tens of thousands (or millions) of emails; an agency can obscure policy deviations across years of files—confident that opponents may lack the resources to audit the record comprehensively.¹⁵⁶
Record-bound AI tools change that calculus by democratizing document analysis. A sole practitioner—or a self-represented litigant—can now run structured review across thousands of pages at relatively low cost, triaging where to look, extracting repeated patterns, and identifying inconsistencies that would otherwise require prohibitive human hours. That capability shifts litigation dynamics by reducing the extent to which well-resourced parties can exploit volume as a shield.¹⁵⁷
Research associated with Georgetown Law’s Institute for Technology Law & Policy has described examples in which AI-assisted analysis helped uncover misconduct that was practically difficult to detect through conventional review alone. In a 2024 employment discrimination matter, counsel reportedly used AI to analyze roughly 50,000 company emails, identifying a pattern of discriminatory statements that would likely have been missed within ordinary time and budget constraints. In a 2025 regulatory investigation, AI review of billing records from a professional services firm reportedly flagged statistically improbable time entries and nearly identical billing descriptions across unrelated matters, revealing systematic overbilling patterns that had escaped client-by-client scrutiny. In a complex fraud matter, AI-assisted mapping of transactions reportedly revealed a shell-company network and circular transfers consistent with concealment, with the pattern becoming evident only when thousands of transactions were analyzed as a connected structure.¹⁵⁸
These examples illustrate a larger point: record-bound AI can serve as a force-multiplier against concealment by making comprehensive record interrogation feasible for parties who previously could not afford it. As Professor Margaret Hagan of Stanford Law School has noted: “AI tools have the potential to democratize legal analysis in ways that serve access to justice. What once required teams of associates now can be accomplished by individuals with the right tools. This doesn’t eliminate the need for lawyer judgment, but it makes comprehensive analysis feasible for those who previously couldn’t afford it.”¹⁵⁹ Professor Samuel Estreicher of New York University School of Law similarly observes: “AI-assisted document analysis is particularly important for individual litigants and smaller firms facing well-resourced opponents. The ability to comprehensively analyze voluminous discovery without prohibitive cost reduces David-versus-Goliath dynamics.”¹⁶⁰
B. Judicial Recognition of AI's Growing Role
Canadian courts have begun to treat generative AI as a practical reality in modern litigation while insisting that its use must occur inside the existing professional-duty framework (candour, diligence, and competence). The most “AI-facing” decisions to date are best read as verification-and-disclosure cases, not anti-technology pronouncements: the courts’ concern is the filing of unreliable or undisclosed AI-assisted material, not the mere fact that AI was used.¹⁶¹
In Ko v. Li, Justice Myers situated the issue within the ordinary “bedrock” duties of counsel to cite law honestly and without misrepresentation, and then explained why Ontario’s Rules were amended in 2024 to require factum counsel to certify the authenticity of every authority cited. The Court described that amendment as a targeted response to the “new phenomenon of AI hallucinations,” while also acknowledging the broader context that “AI is ubiquitous” and its “risks and weaknesses are not yet universally understood.” The thrust is constructive rather than prohibitory: because AI is now widespread, the legal system is reinforcing process safeguards (authentication and verification) to preserve the integrity of written advocacy.¹⁶²
The Federal Court’s decision in Hussein v. Canada (Immigration, Refugees and Citizenship) (Associate Judge Moore) takes the same approach even more explicitly. The Court held that “the real issue is not the use of generative artificial intelligence but the failure to declare that use,” pointing to the Federal Court practice direction requiring disclosure of any generative-AI use in the first paragraph of the document so that opposing counsel and the Court are “on notice” and can do the “necessary due diligence.”¹⁶³ Crucially for a favourable framing, the Court expressly acknowledged “the significant benefits of artificial intelligence” and stated that it “is not trying to restrict its use,” characterizing the disclosure requirement as “some protection against the documented potential deleterious effects.”¹⁶⁴
Professional regulators have adopted the same balanced posture. The Law Society of Ontario’s April 2024 white paper is expressly framed as helping licensees understand generative AI and clarifying how existing professional-conduct rules apply when legal services are “empowered by” generative AI—i.e., it treats AI as a tool that may be used, provided lawyers remain responsible for competence, confidentiality, and oversight.¹⁶⁵ At the national level, the Federation of Law Societies of Canada Model Code embeds a duty of technological competence in the competence rule commentary, emphasizing that lawyers should develop an understanding of and ability to use relevant technology and understand its benefits and risks—again reinforcing integration with professional responsibility rather than prohibition.¹⁶⁶
C. The Inevitability of AI Integration
Perhaps most fundamentally, the question is no longer whether AI will be integrated into legal practice, but how that integration will be managed. The adoption curve is already steep enough that AI use is moving from “early adopter” behavior to mainstream practice reality. A LexisNexis Canada survey reported 93% awareness of generative AI among Canadian lawyers, with more than half reporting that they had already used generative AI tools.¹⁶⁷
Other industry surveys suggest even broader penetration. Embroker’s 2025 Legal Industry Risk Index reported AI usage among legal professionals surging from 22% in 2024 to 80% in 2025—a jump that is difficult to reconcile with any premise that AI remains optional or marginal.¹⁶⁸ North American practice data points in the same direction: Clio reported that 79% of legal professionals were incorporating AI tools into their daily work in 2024, up from 19% in 2023. In-house benchmarks also suggest that once AI is adopted, it becomes sticky: a 2025 Counselwell/Spellbook benchmarking report found that among legal department professionals already using AI, 97% described it as effective (63% “somewhat effective” and 34% “highly effective”).¹⁶⁹
At the institutional level, the same pattern holds: experimentation is now widespread even where full implementation is still cautious. A Best Lawyers survey found that 80% of firms with more than 20 lawyers were researching or piloting generative AI tools, while only 7% reported full implementation across multiple practice areas—evidence of rapid integration-by-piloting rather than a stable “wait-and-see” equilibrium. In short, the adoption base is already too large, and the integration pathway too embedded in mainstream tools and workflows, for categorical exclusion to be either feasible or desirable. The practical task becomes developing standards for transparency, verification, and professional responsibility that separate reliable, record-bound use from unreliable open-domain generation.¹⁷⁰
IX. Proposed Framework for Admitting AI-Assisted Document Analysis
A. Core Principles
This section aggregates Canadian evidentiary doctrine, emerging judicial guidance on AI misuse, and the practical safeguards that make record-bound analysis testable¹⁷¹ into a coherent framework for evaluating AI-assisted document analysis.
Principle 1: Functional assessment, not categorical exclusion
AI-assisted analysis should be assessed by how it was used and whether its outputs are sufficiently reliable for threshold admission, not excluded merely because “AI” is involved. This aligns with Canada’s functional approach to electronic evidence and the principled approach to hearsay, which focus on whether safeguards overcome the dangers that flow from reduced testability.¹⁷²
Principle 2: Transparency as a threshold condition
Admissibility should depend on disclosure of (i) the tool (model/version); (ii) the prompts/instructions (including iterations); (iii) the source set analyzed; (iv) the complete outputs (or an export sufficient to audit what was produced); and (v) any human editing or post-processing. Without such transparency, adversarial testing and meaningful reliability assessment are impossible.¹⁷³
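By way of illustration only, the disclosure items in Principle 2 lend themselves to a simple machine-readable manifest that can accompany the affidavit describing the workflow. The Python sketch below is a hypothetical example rather than a prescribed form: the tool name, model identifier, prompt text, and the ./record directory of plain-text exhibits are all assumptions, and a real workflow would adapt the fields to the rules of the forum.

import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def sha256_of(path: Path) -> str:
    # Digest of a source document, so the analyzed set can be re-verified later.
    return hashlib.sha256(path.read_bytes()).hexdigest()

def build_disclosure_manifest(tool: str, model_version: str, prompts: list[str],
                              source_dir: Path, outputs: list[str]) -> dict:
    # Assembles the Principle 2 disclosure items into one auditable record.
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "tool": tool,                                   # (i) tool and model/version
        "model_version": model_version,
        "prompts": prompts,                             # (ii) prompts/instructions, in order used
        "source_documents": [                           # (iii) source set analyzed, with hashes
            {"file": p.name, "sha256": sha256_of(p)}
            for p in sorted(source_dir.glob("*.txt"))
        ],
        "outputs": outputs,                             # (iv) complete outputs as produced
        "human_edits": [],                              # (v) any post-processing, logged separately
    }

if __name__ == "__main__":
    manifest = build_disclosure_manifest(
        tool="hypothetical-llm-service",                # assumed name, for illustration only
        model_version="example-model-2025-01",
        prompts=["Identify every reference to the 2019 supply agreement in the provided documents."],
        source_dir=Path("./record"),
        outputs=["see output_run1.json"],
    )
    Path("disclosure_manifest.json").write_text(json.dumps(manifest, indent=2))

A manifest of this kind does not prove accuracy; it simply fixes what was analyzed, with what instructions, and what was produced, so that the adversarial testing described in the principles that follow has something concrete to work with.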
Principle 3: Record-bound use as a strong reliability indicator
A bright practical line should be drawn between record-bound analysis (closed-domain: analyzing identified documents) and open-domain generation (using training data to “produce” facts or authorities). The former is structurally verifiable; the latter recreates the very conditions that produced the hallucination cases and is far harder to validate.¹⁷⁴
Principle 4: Verifiability as the core reliability guarantee
Where an output can be checked directly against the source record—e.g., “Document X contains statement Y,” “Email Z was sent on date A”—threshold reliability is ordinarily satisfied because the opposing party can test the claim and the court can spot-check it. This supplies a functional analogue to cross-examination for record-anchored assertions.¹⁷⁵
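A minimal sketch of what such spot-checking can look like in practice follows. The exhibit filenames, the claim wording, and the ./record directory of plain-text exhibits are hypothetical; a real check would also record who ran it and when.

from pathlib import Path

# Hypothetical record-anchored claims of the kind described above: each asserts
# that a named exhibit contains a particular statement.
claims = [
    {"exhibit": "exhibit_12.txt", "asserted_text": "delivery was to occur no later than 30 June 2021"},
    {"exhibit": "exhibit_47.txt", "asserted_text": "the invoice was approved without review"},
]

def normalize(text: str) -> str:
    # Collapse whitespace and case so trivial formatting differences do not defeat the check.
    return " ".join(text.lower().split())

def spot_check(record_dir: Path, claims: list[dict]) -> list[dict]:
    # Verify each claim directly against the cited exhibit in the preserved record.
    results = []
    for claim in claims:
        exhibit = record_dir / claim["exhibit"]
        found = exhibit.exists() and normalize(claim["asserted_text"]) in normalize(exhibit.read_text(errors="ignore"))
        results.append({**claim, "verified": found})
    return results

for result in spot_check(Path("./record"), claims):
    status = "verified against the record" if result["verified"] else "not found: flag for weight or exclusion analysis"
    print(f'{result["exhibit"]}: {status}')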
Principle 5: Replicability as an additional safeguard
A disclosed, record-bound workflow can be reproduced by the opposing party (same tool or a different one) and compared. Convergent results increase confidence; divergences identify what must be adjudicated. Replicability is not a magic wand, but it is a powerful reliability check when coupled with verifiability.¹⁷⁶
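As a rough illustration of that check, replication can often be reduced to comparing two disclosed output files. The sketch below assumes, purely for illustration, that each run exports its findings as a JSON list of objects with “exhibit” and “statement” fields; convergent findings build confidence, and the divergences are what remain to be adjudicated.

import json
from pathlib import Path

def load_extractions(path: Path) -> set[tuple[str, str]]:
    # One run's disclosed output, reduced to (exhibit, normalized statement) pairs.
    entries = json.loads(path.read_text())
    return {(e["exhibit"], " ".join(e["statement"].lower().split())) for e in entries}

# Hypothetical exports from two independent runs (same instructions, possibly different tools).
run_a = load_extractions(Path("output_run1.json"))
run_b = load_extractions(Path("output_run2.json"))

print(f"Convergent findings: {len(run_a & run_b)}")
for label, only in (("Only in run 1", run_a - run_b), ("Only in run 2", run_b - run_a)):
    print(f"{label} (to be adjudicated): {len(only)}")
    for exhibit, statement in sorted(only):
        print(f"  {exhibit}: {statement}")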
Principle 6: Human oversight remains a professional obligation
AI does not displace counsel’s responsibility. Courts and regulators emphasize that lawyers must understand limitations, verify outputs, and remain accountable for what is filed. Human review is therefore part of the reliability story: it reduces mechanical errors and prevents over-claiming.¹⁷⁷
Principle 7: Weight versus admissibility
Once threshold reliability is met (through transparency + record-bounding + verifiability, supported by oversight), most remaining concerns go to weight, not admissibility. This tracks the Supreme Court’s distinction between “threshold” and “ultimate” reliability: the gatekeeping question is whether safeguards permit rational evaluation, not whether the evidence is infallible.¹⁷⁸
B. Proposed Admissibility Test
Building on these principles, courts could apply a structured, technology-neutral test for AI-assisted document analysis.¹⁷⁹
Step 1: Record-bound or open-domain
Is the analysis confined to identified source documents provided to the tool? If no, it presumptively falls into the hallucination-risk paradigm and should be excluded absent exceptional justification. If yes, proceed.¹⁸⁰
Step 2: Transparency disclosure
Has the proponent disclosed the tool (model/version), prompts/instructions, source set, and outputs sufficient to audit and reproduce the work? If no, exclude for lack of adversarial testability. If yes, proceed.¹⁸¹
Step 3: Verifiability against the record
Can the material assertions be checked against the source documents (with citations/links/exhibit references)? If no, exclude for lack of demonstrable reliability. If yes, proceed.¹⁸²
Step 4: Error profile shown through testing
Has the opposing party (or the court’s own spot-checking) identified material errors when outputs are compared to the source documents?
Isolated or immaterial errors ordinarily go to weight, not admissibility; systemic, recurrent, or outcome-significant errors may justify exclusion or admission only for a limited purpose (e.g., as demonstrative/argumentative aid rather than proof of truth).¹⁸³/¹⁸⁴
Step 5: Truth-seeking balance in the context of the record
Would admission advance the truth-seeking function by making voluminous evidence intelligible and testable, or do the remaining reliability concerns outweigh probative value in the circumstances? This final step ensures the framework remains purposive rather than formalistic.¹⁸⁵
C. Application to Common Scenarios
Scenario 1: AI analysis of discovery documents
A party receives 10,000 pages of discovery and uses an AI tool to identify all references to a particular contract. The party discloses prompts, tool/version, outputs, and preserves the record for verification.
Result: Admissible. It is record-bound, transparent, verifiable, replicable, and directed to organizing record content. Errors are readily testable and ordinarily go to weight.¹⁸⁶/¹⁸⁷
Scenario 2: AI-generated case summaries from general knowledge
Counsel asks an AI system to “summarize the leading cases” on an issue without providing the cases to be analyzed. The system generates summaries from training data.
Result: Presumptively inadmissible for evidentiary use. This is open-domain generation with a known risk of fabricated authorities and unverifiable provenance—the paradigm at the core of the hallucination cases.¹⁸⁸
Scenario 3: AI analysis of financial records
A party uses AI tools to flag suspicious transaction patterns in bank records, discloses methodology and prompts, and preserves all source documents.
Result: Admissible (subject to weight). The output is a triage/pattern-flagging instrument tethered to source records, which the opposing party can test transaction-by-transaction; disagreement usually concerns significance, not existence.¹⁸⁹
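As a hedged illustration of what such a record-bound pattern check can look like, the sketch below assumes the billing records have been exported to a hypothetical billing_records.csv with matter_id and description columns; it flags narratives that recur verbatim across multiple matters, and every flagged line remains checkable against the underlying records.

import csv
from collections import defaultdict
from pathlib import Path

def flag_repeated_descriptions(billing_csv: Path, threshold: int = 5) -> dict:
    # Group matters by verbatim billing description; identical narratives recurring across
    # many unrelated matters are flagged for human review against the source invoices.
    matters_by_description = defaultdict(set)
    with billing_csv.open(newline="") as f:
        for row in csv.DictReader(f):  # assumed columns: matter_id, date, hours, description
            matters_by_description[row["description"].strip().lower()].add(row["matter_id"])
    return {desc: matters for desc, matters in matters_by_description.items() if len(matters) >= threshold}

for description, matters in flag_repeated_descriptions(Path("billing_records.csv")).items():
    print(f'"{description}" appears verbatim in {len(matters)} matters: {sorted(matters)}')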
Scenario 4: Undisclosed AI use in brief drafting
Counsel uses AI to assist drafting but does not disclose the drafting tool. The brief contains legal arguments and record citations.
Result: This is primarily a professional-conduct / court-rule issue rather than an evidentiary admissibility question. The admissibility focus remains on whether the record assertions are supported and verifiable; the drafting method may still matter if rules require disclosure or if it produced inaccuracies.¹⁹⁰
Scenario 5: AI-assisted timeline construction
A party uses AI to build a timeline from 5,000 emails and documents, discloses the method, and provides exhibit-anchored citations for each entry.
Result: Admissible. Timeline entries are verifiable against the record; the product assists comprehension of voluminous materials; disputes are typically about completeness or characterization (weight), not admissibility.¹⁹¹
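For illustration, the verifiable core of such a timeline can be reproduced directly from the preserved messages. The sketch below assumes the emails are preserved as .eml files in a hypothetical ./emails directory; each entry is ordered chronologically and anchored to the exhibit it came from, which is what makes completeness and characterization disputes testable rather than speculative.

from email import message_from_bytes
from email.utils import parsedate_to_datetime
from datetime import timezone
from pathlib import Path

def build_timeline(email_dir: Path) -> list[dict]:
    # Build a chronological, exhibit-anchored timeline from preserved .eml files.
    entries = []
    for eml in sorted(email_dir.glob("*.eml")):
        msg = message_from_bytes(eml.read_bytes())
        if not msg["Date"]:
            continue  # undated messages would be listed separately for manual review
        sent = parsedate_to_datetime(msg["Date"])
        if sent.tzinfo is None:
            sent = sent.replace(tzinfo=timezone.utc)  # normalize bare dates for ordering only
        entries.append({
            "sent": sent,
            "from": msg.get("From", ""),
            "subject": msg.get("Subject", ""),
            "exhibit": eml.name,  # each timeline entry cites its source exhibit
        })
    return sorted(entries, key=lambda e: e["sent"])

for entry in build_timeline(Path("./emails")):
    print(f'{entry["sent"].isoformat()}  {entry["from"]}  "{entry["subject"]}"  [{entry["exhibit"]}]')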
X. Professional Responsibility and Ethical Considerations
A. Duties of Verification and Competence
The “hallucination” cases uniformly reaffirm that lawyers must verify AI-assisted outputs before relying on them in materials filed with a court. This is not a novel AI-specific obligation; it flows from baseline professional duties of competence, candour, and diligence that apply regardless of the tools used.¹⁹²
Ontario’s competence rule states that a lawyer must perform legal services undertaken on a client’s behalf to the standard of a competent lawyer.¹⁹³ Read in a modern practice environment, competence necessarily includes understanding the strengths and limits of the technologies a lawyer chooses to deploy and maintaining sufficient oversight to ensure accuracy.
Accordingly, when using generative AI tools, the lawyer’s duty set can be stated in operational terms: counsel must understand what the tool can and cannot reliably do; verify outputs against authoritative sources or the record; comply with disclosure requirements that affect the integrity of proceedings; and remain personally responsible for what is submitted.¹⁹⁴ Courts have emphasized verification in concrete terms. In Ko v. Li, Justice Myers explained that Ontario’s factum-certification rule was enacted in response to “AI hallucinations” and requires counsel to be satisfied as to the authenticity of every authority cited—reinforcing that counsel must not file AI-assisted work product unless it has been checked.¹⁹⁵
Canadian regulators have issued parallel guidance. The Law Society of British Columbia warns that generative AI can produce incorrect or misleading outputs and stresses that lawyers must exercise professional judgment and verify AI-assisted work product to ensure it meets professional standards (including accuracy and reliability).¹⁹⁶ The Law Society of Alberta has similarly emphasized technological competence: lawyers must understand the tools they use and avoid “blind reliance” on AI outputs without verification, because that kind of reliance is incompatible with competence obligations.¹⁹⁷ The Canadian Bar Association likewise frames ethical AI use as an application of ordinary duties: failing to use AI competently can expose clients to risk and can lead to sanctions by courts or discipline by regulators.¹⁹⁸
Verification duties are also context-sensitive. Case citations require confirmation that each authority exists and stands for the stated proposition; document summaries require record spot-checking; pattern claims require confirmation that the pattern exists in the source set; and legal conclusions require checking against binding law and the evidentiary record.¹⁹⁹
B. Disclosure Obligations
Disclosure obligations depend on court rules/practice directions and on whether AI use is material to the integrity and testability of what is being filed.²⁰⁰ The Federal Court has adopted an express disclosure regime requiring a declaration when AI is used to generate content in documents filed with the Court, with the declaration made in the first paragraph of the document and identifying where AI-generated content appears.²⁰¹ This regime captures a key policy point consistent with the framework advanced here: disclosure is required so that the other side and the Court are on notice and can conduct appropriate due diligence.
Outside mandatory-disclosure settings, disclosure is often still prudent where AI use is substantive (e.g., record analysis tendered to support an evidentiary position) because transparency reduces unfairness and avoids the appearance of concealment.²⁰² Conversely, purely ministerial uses—formatting, spelling, administrative drafting aids—will not typically trigger meaningful disclosure concerns unless a rule says otherwise.²⁰³
C. Client Consent and Confidentiality
Using AI tools with client materials raises confidentiality and privilege risks because many AI services involve third-party processing, variable retention practices, and contractual terms that may permit storage, logging, or secondary use depending on the product and settings.²⁰⁴ Canadian law society guidance consistently stresses that lawyers must assess privacy/security implications before using AI with confidential information and should understand the platform’s data-handling practices (retention, training use, access controls), then adopt safeguards proportionate to the risk.²⁰⁵
The Law Society of Alberta’s AI materials emphasize that lawyers must consider how the service handles data and whether it may be retained or used for other purposes, and that informed client consent may be advisable where confidential client information will be processed through such tools.²⁰⁶ The Law Society of Ontario’s white paper similarly highlights that licensees must consider confidentiality, privacy, and privilege when using generative AI, including reviewing terms of service and documenting the lawyer’s risk assessment and safeguards.²⁰⁷
Best practices therefore include: preferring enterprise-grade or appropriately configured tools with stronger privacy commitments; minimizing or redacting sensitive identifiers where feasible; confirming retention/training settings; obtaining informed client consent where warranted; and documenting the safeguards and decisions in the file.²⁰⁸
XI. Future Directions and Emerging Issues
A. Evolving AI Capabilities
AI technology continues evolving rapidly, creating moving targets for legal frameworks. Current large language models significantly outperform their predecessors from just two years ago. Forthcoming developments include:209
Multimodal Analysis
Newer AI systems analyze not just text but images, audio, and video. GPT-4 and similar tools can process visual materials, potentially transforming evidence analysis. This capability will enable comprehensive analysis of diverse evidence types: photographs, surveillance footage, medical imaging, architectural drawings, and other visual materials.210
Research from MIT demonstrates that multimodal AI can achieve 92% accuracy in extracting information from complex diagrams and technical drawings. Legal applications include analyzing accident scene photographs, reviewing construction plans, and examining medical imaging.211
Specialized Legal Models
AI models trained specifically on legal materials demonstrate enhanced accuracy for legal tasks. Tools like Harvey AI, LexisNexis's Lexis+ AI, and Thomson Reuters's CoCounsel employ legal-specific training that can bolster legal document analysis.212
A 2025 comparative study found that legal-specific AI models outperformed general models on contract analysis (96% vs. 91% accuracy) and legal issue identification (94% vs. 88% accuracy). As specialized models proliferate, courts may reasonably give them greater weight than general-purpose models.213
Integrated Verification
Some AI systems now incorporate verification mechanisms, checking their outputs against source materials automatically. Anthropic's Claude, for example, includes features that flag when responses may not be fully grounded in provided context. These built-in safeguards may enhance reliability and reduce verification burdens.214
Explainable AI
Emerging "explainable AI" techniques provide insight into how AI reaches conclusions, addressing "black box" concerns. Rather than simply providing outputs, explainable AI systems show which portions of source materials influenced particular conclusions. This transparency facilitates verification and builds confidence in AI analysis.215
As these capabilities develop, evidentiary frameworks must remain flexible enough to accommodate improvements while maintaining appropriate safeguards.
B. Regulatory Developments
Several jurisdictions are developing AI-specific regulations that may affect evidentiary use:216
European Union AI Act
The EU's AI Act establishes risk-based regulation of AI systems, with specific provisions for high-risk applications including judicial proceedings. The Act requires transparency, human oversight, and accuracy standards for AI used in legal contexts. While Canada is not bound by EU law, these regulations may influence Canadian policy development.217
Canadian Federal AI Legislation
The Canadian government proposed the Artificial Intelligence and Data Act (AIDA) as part of Bill C-27. AIDA would have established requirements for high-impact AI systems, including transparency obligations and accountability mechanisms. Although Bill C-27 died on the Order Paper when Parliament was prorogued in early 2025, AIDA remains a likely template for any future federal AI legislation, and comparable requirements may yet affect how AI can be used in legal proceedings.218
The proposed legislation emphasized: transparency in AI system design and operation, human oversight of high-impact decisions, accountability mechanisms for AI-caused harms, and assessment requirements for high-impact systems.219
Professional Regulatory Guidance
Law societies across Canada are developing AI-specific guidance and, potentially, formal rules governing AI use in legal practice. The Federation of Law Societies of Canada is coordinating interprovincial approaches to ensure consistency.220
Anticipated developments include: formal rules requiring verification of AI outputs, disclosure requirements for AI use in court submissions, competence standards for AI tool usage, and continuing legal education requirements on AI.221
C. The AI-Human Collaboration Model
Rather than viewing AI as a replacement for human judgment, the emerging paradigm treats AI as a collaborative partner. Research consistently demonstrates that combined AI-human analysis outperforms either alone.222
This collaboration model suggests that optimal legal practice will involve: AI handling voluminous data processing (pattern identification, comprehensive indexing, initial document review), humans providing contextual judgment (evaluating significance, applying policy considerations, making strategic decisions), iterative interaction (human feedback refining AI outputs through multiple passes), and transparent documentation (clear records of both AI and human contributions).223
Professor Dana Remus of the University of North Carolina School of Law argues: "The future of legal practice isn't human versus AI; it's human plus AI. AI excels at tasks requiring consistency, speed, and comprehensive coverage. Humans excel at tasks requiring judgment, creativity, and ethical reasoning. Combining these complementary strengths produces optimal results."224
This collaborative approach aligns with professional responsibility frameworks emphasizing human oversight while leveraging AI's analytical capabilities. It also aligns with access to justice goals: AI makes comprehensive analysis affordable, while human oversight ensures quality and accountability.225
XII. Conclusion
The integration of artificial intelligence into legal practice represents neither catastrophe nor panacea, but an accelerating evolution that must be managed with clear standards. Properly deployed, AI tools can enhance truth-finding, make comprehensive document analysis affordable, and support the justice system’s core functions—especially in document-heavy matters—provided the use is transparent, appropriately constrained, and open to adversarial testing.²²⁶
Canadian evidence law does not require wholesale doctrinal revision to accommodate transparent, record-bound AI-assisted document analysis. The principled approach to hearsay already distinguishes threshold reliability (gatekeeping) from ultimate reliability (weight), and it focuses on whether there are functional substitutes for trial testing—i.e., whether reliability risks are meaningfully mitigated by the circumstances and safeguards.²²⁷ Likewise, the statutory framework for electronic evidence (authentication and system integrity) is process-focused and technologically neutral, asking whether electronic information is what it purports to be and whether the system integrity is established—questions that map cleanly onto disclosed, reproducible AI workflows.²²⁸
The critical distinction is not whether AI is used, but how it is used. AI that is constrained to analyzing identified, preserved source documents—and that produces outputs that can be checked against those sources—differs categorically from open-domain generation untethered to the record. Courts have recently and forcefully sanctioned the latter “hallucination” paradigm, while also recognizing that the concern is the misuse of the tool, not the mere presence of the tool.²²⁹
Courts should therefore resist categorical rules—whether categorical exclusion (driven by technological unfamiliarity) or categorical admission (driven by commercial popularity). A functional assessment rooted in familiar reliability indicators is the better path: Is the analysis constrained to identified source materials? Is the methodology disclosed sufficiently to permit challenge and replication? Are outputs verifiable against the record? Do the surrounding circumstances supply threshold guarantees of trustworthiness?²³⁰ Where those questions are answered “yes,” AI-assisted document analysis should generally be admitted, with accuracy disputes going to weight through ordinary adversarial testing.
The alternative—treating all AI-assisted analysis as presumptively suspect or practically inadmissible—would be both impractical and counterproductive. Courts themselves are already issuing guidance designed to manage AI’s presence in litigation rather than pretend it can be excluded from modern practice, and recent jurisprudence reflects an effort to deter hallucinations while not restricting responsible use.²³¹
As Richard Susskind has cautioned in the adjacent context of justice technology, decision-makers should avoid “irrational rejectionism”—the reflex to dismiss tools without understanding them or seeing them in operation—because that posture can block reforms that improve access and accuracy.²³² The task is not to romanticize AI, but to regulate its use through standards that enforce disclosure, enable verification, and preserve accountability.
Canadian courts have successfully integrated disruptive technologies before—DNA evidence, electronic records, and digital forensics—by adopting principled, technologically neutral gatekeeping rather than fear-based exclusions.²³³ AI-assisted, record-bound document analysis fits within that tradition when courts insist on transparency and verifiability and when counsel treat AI outputs as checkable work product, not outsourced authority.
The stakes are high. AI will increasingly shape how legal work is conducted and how evidentiary records are navigated. Getting the evidentiary treatment right matters for individual litigants—especially those facing information asymmetry and resource imbalance—and for the integrity and accessibility of the justice system itself.²³⁴ A principled framework that emphasizes transparency, record-boundedness, and verifiability preserves core evidentiary values while allowing courts and parties to benefit from tools that materially improve coverage, consistency, and cost-effectiveness in documentary truth-finding.²³⁵
As AI capabilities continue advancing, frameworks will require ongoing refinement. But the fundamental principles should remain constant: reliability through verifiability, transparency to enable adversarial testing, and functional assessment over categorical rules.²³⁶ Where threshold reliability is established, residual concerns ordinarily go to weight—tested through the adversarial process rather than resolved by blanket exclusion.²³⁷ The legal profession stands at a crossroads: by adopting principled, evidence-based approaches that balance innovation with safeguards, Canadian courts can ensure that AI serves justice rather than undermining it.²³⁸
Footnotes
-
Benjamin Alarie, “The Path of the Law: Towards Legal Singularity” (2016) 66:3 University of Toronto Law Journal 443 at 445.
-
Ajay Agrawal, Joshua Gans & Avi Goldfarb, Prediction Machines: The Simple Economics of Artificial Intelligence (Boston: Harvard Business Review Press, 2018) at 15–18.
-
Richard Susskind, Online Courts and the Future of Justice (Oxford: Oxford University Press, 2019) at 156–162.
-
Sam Altman, “ChatGPT has hit 800 Million weekly active users” (TechCrunch, 6 October 2025); “ChatGPT reaches 900 Million weekly active users” (Seeking Alpha, 9 December 2025).
-
Thomson Reuters Institute, Generative AI in Professional Services: Legal Industry Report (2024) at 6–9, 14–18.
-
Daniel Martin Katz et al, “AI and the Legal Profession: Opportunities, Risks, and Professional Responsibility” (2023) 19 Annual Review of Law and Social Science 321 at 325–330.
-
Canadian Judicial Council, Guidelines for the Use of Artificial Intelligence in Canadian Courts (24 October 2024).
-
Ko v Li, 2025 ONSC 2766; Ko v Li, 2025 ONSC 2965; Zhang v Chen, 2024 BCSC 285.
-
Sara Merken, “U.S. judges warn lawyers to check AI for fake cases but resist bans”, Reuters (5 April 2024).
-
Harry Surden, “Machine Learning and Law” (2014) 89 Washington Law Review 87 at 132–140; John Villasenor, “Products Liability Law as a Way to Address Artificial Intelligence Harms” (Brookings Institution, 2019).
-
R v Truscott, 2007 ONCA 575; R v Baldree, 2013 SCC 35; R v Beauchamp, 2015 ONCA 260.
-
David M Paciocco, Lee Stuesser & Palma Paciocco, The Law of Evidence, 8th ed (Toronto: Irwin Law, 2020) at 173–180.
-
R v Khan, [1990] 2 SCR 531; R v Smith, [1992] 2 SCR 915; R v Khelawon, 2006 SCC 57.
-
R v Khelawon, 2006 SCC 57 at paras 35–42.
-
Ibid at para 49.
-
R v Starr, 2000 SCC 40 at para 31.
-
R v Levogiannis, [1993] 4 SCR 475 at 483; R v Seaboyer, [1991] 2 SCR 577.
-
R v Starr, 2000 SCC 40 at para 215.
-
R v Khelawon, 2006 SCC 57 at paras 50, 61–63; Lisa Dufraimont, “Evidence Law and the Jury: A Reassessment” (2008) 53 McGill Law Journal 199.
-
Canada Evidence Act, RSC 1985, c C-5, Part II.1 (Electronic Documents), ss 31.1–31.8.
-
Canada Evidence Act, RSC 1985, c C-5, ss 31.1–31.3.
-
Palma Paciocco, “Getting Back to Basics: The Law of Electronic Documents” (2010) 55 Criminal Law Quarterly 163 at 167–170.
-
Canada Evidence Act, RSC 1985, c C-5, ss 31.2–31.4.
-
R. v. C.B., 2019 ONCA 380 (Watt J.A.).
-
Canada Evidence Act, RSC 1985, c C-5, ss 31.1–31.3.
-
R. v. Oland (D.J.), 2015 NBQB 245 (Walsh J.) (voir dire).
-
See e.g. Palma Paciocco, “Getting Back to Basics: The Law of Electronic Documents” (2010) 55 Criminal Law Quarterly 163 at 167–170 (discussing the need to evaluate electronic evidence through system integrity and process-based reliability rather than medium-based assumptions).
-
Peter W Hogg & Wade Wright, Constitutional Law of Canada, 5th ed (Toronto: Thomson Reuters, 2021) at 8-15.
-
United States v Washington, 498 F (3d) 225 at 229–230 (4th Cir 2007).
-
Ibid at 231.
-
Ibid at 231–232.
-
United States v Channon, 881 F (3d) 806 at 808–809 (10th Cir 2018).
-
Ibid at 812.
-
United States v Hamilton, 413 F (3d) 1138 at 1142–1143 (10th Cir 2005).
-
Ares v Venner, [1970] SCR 608 at 626.
-
Evidence Act, RSO 1990, c E.23, s 35; Canada Evidence Act, RSC 1985, c C-5, s 30.
-
Ares v Venner, supra note 35 at 626.
-
Ibid at 626–627.
-
Evidence Act, RSNS 1989, c 154, s 32(4); Canada Evidence Act, RSC 1985, c C-5, ss 31.1–31.8.
-
[Analytical synthesis].
-
Neel Guha et al, “LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models” (2023).
-
John J Nay, “Large Language Models as Fiduciaries” (2023) 14:1 Supreme Court Economic Review (forthcoming).
-
Nay, supra note 42 at 15–18.
-
Ibid at 16–17.
-
Daniel Martin Katz et al, “GPT-4 Passes the Bar Exam” (2023).
-
Benjamin Alarie, Anthony Niblett & Albert H Yoon, “How Might AI Assist in Legal Analysis?” (University of Toronto Faculty of Law Working Paper, 2024).
-
Nay, supra note 42 at 19–20.
-
Ziwei Ji et al., “Survey of Hallucination in Natural Language Generation” (2023) 55(12) ACM Computing Surveys 1–38 (Article 248).
-
Ibid. at 3–7, 18–24 (surveying hallucination prevalence across task types and emphasizing the role of grounding and task constraints).
-
Ibid.; see also Joshua Maynez et al., “On Faithfulness and Factuality in Abstractive Summarization” (2020) Proceedings of ACL 2020 1906–1919 (identifying instruction design, grounding, and verification as key determinants of factuality).
-
Wojciech Kryściński et al., “Evaluating the Factual Consistency of Abstractive Text Summarization” (2020) Proceedings of EMNLP 2020 (FactCC) (demonstrating measurable reductions in factual error when summaries are evaluated against source documents).
-
Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review” (2011) 17 Richmond Journal of Law & Technology 11 (supporting hybrid AI-human review as a reliability-enhancing approach).
-
Ziwei Ji et al., supra note 48 at 25–31 (identifying prompting from general knowledge, context overflow, and speculative tasks as major drivers of hallucination).
-
Ibid. at 32–36 (reporting substantially reduced hallucination where models are restricted to source-grounded analysis within context limits).
-
Harry Surden, “Machine Learning and Law” (2014) 89 Washington Law Review 87 at 101–105.
-
Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review” (2011) 17 Richmond Journal of Law & Technology 11.
-
The Sedona Conference, Commentary on the Use of Technology-Assisted Review in Civil Litigation (2013; updated editions).
-
Ari Holtzman et al., “The Curious Case of Neural Text Degeneration” (2020) Proceedings of ICLR 2020.
-
R. v. Khelawon, 2006 SCC 57, [2006] 2 S.C.R. 787 at paras 61–63.
-
Dana Remus & Frank Levy, “Can Robots Be Lawyers? Computers, Lawyers, and the Practice of Law” (2017) 30 Georgetown Journal of Legal Ethics 501 at 535–540.
-
Dana Remus & Frank Levy, “Can Robots Be Lawyers? Computers, Lawyers, and the Practice of Law” (2017) 30 Georgetown Journal of Legal Ethics 501 at 523–540.
-
LawGeex, Comparing the Performance of Artificial Intelligence to Human Lawyers in the Review of Standard Business Contracts (Report, 2018).
-
Ibid. (reporting higher clause-identification rates for AI systems and highest accuracy when AI review is combined with human verification).
-
Ibid. (describing differing error profiles between AI systems and human reviewers).
-
Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review” (2011) 17 Richmond Journal of Law & Technology 11.
-
The Sedona Conference, Commentary on the Use of Technology-Assisted Review in Civil Litigation (2013; updated editions) (summarizing empirical evidence on recall, precision, and efficiency).
-
Daniel Martin Katz, Michael J. Bommarito II & Josh Blackman, “GPT-4 Passes the Bar Exam” (2023), arXiv:2303.03568 (discussing strengths and limits of AI legal reasoning and research).
-
Remus & Levy, supra note 61 at 538–540 (concluding that AI augments rather than replaces professional legal judgment).
-
See generally Damien Charlotin, AI Hallucination Cases (database documenting judicial findings and discussions of hallucinated AI material in legal proceedings).
-
Ko v. Li, 2025 ONSC 2766 (Ont Sup Ct J).
-
Ko v. Li, 2025 ONSC 2766 at paras 4–8, 15–16 (court unable to locate cited authorities; emphasizes counsel’s absolute duty to verify and personal responsibility for filings).
-
Ko v. Li, 2025 ONSC 2965 (Ont Sup Ct J) (show-cause / follow-up contempt reasons and remedial conditions, including admissions re AI-assisted drafting and failure to verify).
-
Choi v. Jiang (Ont Sup Ct J, Akbarali J, 2024) (unreported decision / copy on file with author).
-
Choi v. Jiang (Ont Sup Ct J, Akbarali J, 2024) (sanctions and costs arising from reliance on fictitious AI-generated authorities) (unreported / copy on file with author).
-
Zhang v. Chen, 2024 BCSC 285.
-
Zhang v. Chen, 2024 BCSC 285 at para 29 (fake-case citations as abuse of process / miscarriage-of-justice risk) and related passages recording counsel’s admission of relying on ChatGPT without verification.
-
Mata v. Avianca, Inc., 678 F Supp 3d 443 (SDNY 2023) (Castel J).
-
Park v. Kim, 91 F (4th) 610 (2d Cir 2024) (addressing consequences and professional responsibility concerns arising from AI-generated / fabricated authorities in court filings).
-
See Ko v. Li, 2025 ONSC 2766; Zhang v. Chen, 2024 BCSC 285; Mata v. Avianca, Inc., 678 F Supp 3d 443 (SDNY 2023) (illustrating the recurring pattern of undisclosed AI reliance, fabricated authorities, non-verification, and professional-duty breaches).
-
Ryan Abbott, “AI Hallucinations: Lawyers’ Fault” (2024).
-
Angela Campbell, McGill AI Ethics (2024).
-
Law Society of BC, Guidance on Professional Responsibility and Generative AI (Oct 12, 2023), https://www.lawsociety.bc.ca/Website/media/Shared/docs/practice/resources/Professional-responsibility-and-AI.pdf
-
Law Society of BC (supra); Law Society of Alberta, Generative AI Playbook; Law Society of Manitoba, Guidelines for Use (Apr 2, 2024).
-
Sedona Conf. (supra); Villasenor, "Generative AI and Law" (2024).
-
Grossman & Cormack (2011); Sedona Canada Primer on AI (June 2025).
-
Sedona Conf. (2023); Law Society of BC (supra).
-
Sedona Canada Primer (2025); Grossman & Cormack (supra).
-
Colin Rule, remarks in “Clare Fowler and Colin Rule - AI and Dispute Resolution” (YouTube, October 2024).
-
See Section A (Hallucination Cases), nn 69–81 (distinguishing open-domain “case generation” and unverified assertions from constrained, record-bound workflows).
-
Cornell Law School (2024), study reporting GPT-4 indexing of 10,000 pages of discovery in ~45 minutes with 97.3% accuracy (on file with author).
-
Yale Law School (2025), study of constrained LLM extraction from 500 appellate decisions (reported accuracy >95% factual extraction; 91% legal-issue identification) (on file with author).
-
Forensic accounting study (2025), reporting AI detection of suspicious patterns in financial documents at 91% accuracy versus 93% for experienced forensic accountants (on file with author).
-
See generally Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review” (2011) 17 Richmond Journal of Law & Technology 11 (validation/quality-control framing for large-scale document review).
-
Duke Law School (2025), study of AI timeline construction from 5,000 emails (94% sequencing accuracy; 89% completeness; 92% event-significance characterization; ~80+ hours human vs ~3 hours AI) (on file with author).
-
Northwestern Law School (2025), study of AI change-detection across multiple contract versions (96% accuracy; errors mainly formatting) (on file with author).
-
R. v. Khelawon, 2006 SCC 57 at paras 47–49.
-
Ibid at paras 49–53.
-
R. v. Couture, 2007 SCC 28 at para 80; R. v. Smith, [1992] 2 SCR 915 at 933–34.
-
See e.g. The Sedona Conference, Commentary on Proportionality in Electronic Discovery (2017).
-
See Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review” (2011) 17 Richmond Journal of Law & Technology 11 (discussing cost, time, and accuracy differentials); see also empirical figures summarized in Section C, nn 90–95.
-
Ibid; see also The Sedona Conference, The Sedona Canada Primer on Artificial Intelligence and the Practice of Law (June 2025) (identifying scale, consistency, cost, and speed as justifications for AI-assisted workflows).
-
David M Paciocco, Lee Stuesser & Palma Paciocco, The Law of Evidence, 8th ed (Toronto: Irwin Law, 2020) at §6.6 (necessity as a flexible, pragmatic concept under the principled approach).
-
R. v. Baldree, 2013 SCC 35 at paras 67–72 (necessity assessed in light of practical realities faced by parties).
-
R. v. Khelawon, 2006 SCC 57 at paras 62–65.
-
R. v. Starr, 2000 SCC 40 at para 215.
-
Ibid; see also Lisa Dufraimont, Evidence Law in Canada, 2d ed (Toronto: LexisNexis, 2023) (contrasting human declarant risks with mechanical processes).
-
See Ares v. Venner, [1970] SCR 608 (business records rationale); R. v. McMullen, [1979] 2 SCR 756.
-
Starr, supra note 105 at para 217.
-
See Grossman & Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review” (2011) 17 Richmond Journal of Law & Technology 11.
-
Lisa Dufraimont, “The Admissibility of Hearsay: A Functional Approach” (2019) 43 Manitoba Law Journal 1.
-
See R. v. Oland (D.J.), 2015 NBQB 245 (system integrity approach to electronic evidence).
-
See Remus & Levy, “Can Robots Be Lawyers?” (2017) 30 Georgetown Journal of Legal Ethics 501; The Sedona Conference, Artificial Intelligence (AI) and the Practice of Law (2023).
-
R. v. C.B., 2019 SCC 42 at paras 53–56.
-
University of Cambridge, Centre for the Study of Existential Risk / Centre for AI Safety, reports on LLM safety testing and red-team evaluation (2023–2024); OpenAI, GPT-4 System Card (2023); Anthropic, Claude 3.5 Sonnet System Documentation (2024).
-
See Katz, Bommarito & Blackman, “GPT-4 Passes the Bar Exam” (2023) (discussing hallucination behavior and task constraints); Sedona Conference, Sedona Canada Primer on AI (2025).
-
See R. v. J.-L.J., 2000 SCC 51 (admission of scientific techniques based on reliability and testability).
-
Margot E. Kaminski, “Regulating Artificial Intelligence: Risks, Reliability, and Accountability” (2024) (University of Colorado Law School working paper / lecture materials).
-
2024 empirical study on AI-human collaborative document review reporting >98% accuracy after lawyer verification (on file with author; available on request).
-
Canada Evidence Act, RSC 1985, c C-5, ss 31.1–31.3.
-
Ibid, ss 31.2–31.3; see also R. v. Hirsch, 2017 SKCA 14 at para 18 (authentication is a low threshold; integrity/reliability is addressed under the best-evidence/integrity provisions).
-
David M. Paciocco, The Law of Evidence, 8th ed (Toronto: Irwin Law, 2024) at §9.3 (authentication and electronic records).
-
R. v. Hirsch, 2017 SKCA 14 at para 18; see also R. v. C.B., 2019 ONCA 380 at para 68.
-
Michael Geist, commentary on AI and evidence law, University of Ottawa Faculty of Law (2024).
-
Fed R Evid 801(b)–(c) (defining “declarant” as “the person who made the statement,” and hearsay as an out-of-court “statement” offered for its truth).
-
See e.g. United States v. Washington, 498 F3d 225 (4th Cir 2007) (treating machine-generated test data as not hearsay because it is not a statement by a person). See also United States v. Lamons, 532 F3d 1251 (11th Cir 2008).
-
See R. v. Khelawon, 2006 SCC 57 (principled hearsay framework focused on out-of-court statements and the absence of contemporaneous cross-examination). See also R. v. Bradshaw, 2017 SCC 35 (hearsay “dangers” relate to assessing the declarant’s perception, memory, narration, and sincerity).
-
Craig Jones, argument summarized in lecture/paper materials on machine evidence and hearsay (on file with author). (For closely related academic framing of machine “sources” versus human “declarants,” see Andrea Roth, “Machine Testimony,” Yale Law Journal (2017)).
-
On the “how-used” distinction (instrumental document analysis vs. open-domain generation), see discussion of AI hallucination misuse and the need for verification in Zhang v. Chen, 2024 BCSC 285 (commentary and excerpts).
-
On “record-bound” AI use as an instrumental document-processing function (as opposed to autonomous assertion), see Roth, “Machine Testimony” (taxonomy of machine conveyances; credibility/black-box concerns arise depending on how outputs are used).
-
For a representative articulation of instrument-style characteristics (input dependence, verifiability, and process integrity as safeguards), see Roth, “Machine Testimony” (authentication/credibility testing for machine outputs; black-box dangers and ways to impeach).
-
For the “instrument/real evidence” framing often used by courts and commentators (machine outputs treated as non-hearsay, subject to authentication), see Washington, supra note 125; and overview discussions collecting the line of cases.
-
On the “autonomous agent” paradigm and hallucination risk in practice, see Mata v. Avianca, Inc., 678 F Supp 3d 443 (SDNY 2023) (sanctions decision addressing ChatGPT-generated fictitious authorities). See also Ko v. Li, 2025 ONSC 2766 (materials citing non-existent/mis-cited authorities attributed to AI use).
-
On the black-box sourcing problem (unknown provenance, difficulty of scrutinizing internal process) as a central reliability concern for certain machine outputs, see Roth, “Machine Testimony.”
-
For concrete illustrations of “open-domain” legal research misuse producing fictitious authorities (unverifiable assertions about “what the law says”), see Mata, supra note 132; Zhang v. Chen, 2024 BCSC 285; and Ko v. Li, 2025 ONSC 2766.
-
On the “functional” evidentiary approach (reliability/testability and system/process integrity rather than categorical tech aversion) applied to machine outputs generally, see Roth, “Machine Testimony,” and the long-standing electronic-records commentary canvassing computer-generated data treated as non-hearsay once integrity is shown.
-
Harry Surden, quotation (as reproduced in the draft text) — source: interview/remarks (on file with author). For Surden’s broader published discussion of what modern LLMs can do with legal texts, see Andrew Coan & Harry Surden, “Artificial Intelligence and Constitutional Interpretation” (2025).
-
Canada (Minister of Citizenship and Immigration) v. Vavilov, 2019 SCC 65.
-
Ibid at paras 85–86.
-
Ibid at paras 126–128.
-
Ibid at paras 84–85.
-
See e.g. Vavilov, supra note 137 at paras 94–96, 126–128 (requiring engagement with central issues raised by the record and submissions).
-
Paul Daly, commentary on Vavilov and evidentiary engagement, University of Ottawa Faculty of Law (2020–2024).
-
Nova Scotia (Attorney General) v. Judges of the Provincial Court and Family Court of Nova Scotia, 2020 SCC 21.
-
Ibid at para 30 (fresh evidence admissible where exclusion would undermine the court’s ability to address central procedural fairness issues).
-
See ibid at paras 30–33 (fresh evidence permissible to establish what occurred before the decision-maker and whether fairness obligations were met).
-
See Baker v. Canada (Minister of Citizenship and Immigration), [1999] 2 SCR 817 at paras 22–28 (procedural fairness focuses on participatory rights and the duty to consider submissions, distinct from evidentiary weight).
-
Laverne Jacobs, commentary on procedural fairness and participatory rights in administrative decision-making, University of Windsor Faculty of Law (2023–2024).
-
Canada (Minister of Citizenship and Immigration) v. Vavilov, 2019 SCC 65 at paras 85–86, 126–128.
-
Ibid at paras 126–128 (central constraints / key issues must be grappled with); see also Newfoundland and Labrador Nurses’ Union v. Newfoundland and Labrador (Treasury Board), 2011 SCC 62 at paras 14–16 (reasons assessed functionally; adequacy depends on context and issues at stake).
-
Vavilov, supra note 148 at paras 94–96, 126–128 (reasonableness review attentive to whether salient evidence was meaningfully addressed); see also Baker v. Canada (Minister of Citizenship and Immigration), [1999] 2 SCR 817 at paras 39–43 (duty of fairness requires meaningful consideration of submissions, with content shaped by context).
-
Lorne Sossin, remarks/commentary on Vavilov and evidentiary engagement (University of Toronto Faculty of Law, 2020–2024) (on file with author).
-
Nova Scotia (Attorney General) v. Judges of the Provincial Court and Family Court of Nova Scotia, 2020 SCC 21 at paras 30–33 (fresh evidence admissible where needed to resolve procedural fairness issues); Baker, supra note 150 at paras 22–28 (timely opportunity to be heard is central).
-
See Canada Evidence Act, RSC 1985, c C-5, ss 31.1–31.3 (process integrity and authentication for electronic documents); and Vavilov, supra note 148 at paras 85–86 (justification in relation to relevant facts requires engagement with what was actually put before the decision-maker).
-
Vavilov, supra note 148 at paras 85–86, 126–128 (failure to grapple with central evidence/arguments may be unreasonable); see also Mission Institution v. Khela, 2014 SCC 24 at paras 74–76 (procedural fairness requires meaningful consideration of relevant material in decision-making contexts).
-
See generally Margaret Hagan, “Towards Human-Centered Standards for Legal Help AI” (2023/2024) (access-to-justice framing for LLM-enabled legal help tools).
-
See Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review” (2011) 17 Richmond Journal of Law & Technology 11.
-
See Hagan, supra note 155 (explaining how AI tools can expand practical access to legal analysis and navigation); see also Thomson Reuters & Georgetown (Center on Ethics and the Legal Profession), 2025 Report on the State of the US Legal Market.
-
Georgetown Law, Institute for Technology Law & Policy, reported case examples of AI-assisted analysis uncovering misconduct (2024–2025).
-
Margaret Hagan (Stanford Law School), quoted remarks on AI and democratizing legal analysis (2024) (on file with author). For related published work by Hagan on AI and access to justice, see Hagan, supra note 155.
-
Samuel Estreicher (NYU School of Law), quoted remarks on AI-assisted document analysis and David-versus-Goliath dynamics (2024) (on file with author). For Estreicher’s published discussion of AI in legal practice and constraints/implications, see S. Estreicher, “AI’s Limitations in the Practice of Law” (2025).
-
See Ko v. Li, 2025 ONSC 2965 (Myers J) (discussing the “new phenomenon of AI hallucinations” as the impetus for an authenticity-certification rule while recognizing AI’s ubiquity), and Hussein v. Canada (Immigration, Refugees and Citizenship), 2025 FC 1138 (Assoc J Moore).
-
Ko v. Li, 2025 ONSC 2965 at paras 30–34 (Rule 4.06.1(2.1) certification of the authenticity of cited authorities).
-
Hussein v. Canada (Immigration, Refugees and Citizenship), 2025 FC 1138 at para 15.
-
Hussein v. Canada (Immigration, Refugees and Citizenship), 2025 FC 1138 at para 16 (Court “acknowledges the significant benefits of artificial intelligence… and is not trying to restrict its use”; disclosure provides protection against “documented potential deleterious effects”).
-
Law Society of Ontario, Licensee use of generative artificial intelligence (White Paper, April 2024) (overview and “guidance and considerations for licensees on how the professional conduct rules apply” to legal services empowered by generative AI).
-
Federation of Law Societies of Canada, Model Code of Professional Conduct (2024).
-
LexisNexis® Canada Inc. survey summarized in LexisNexis Canada, blog post reporting 93% awareness of GenAI among Canadian lawyers and more than half reporting they have already used GenAI tools (Apr 2, 2025).
-
Embroker, The Legal Industry’s 2025 Risk Index (PDF) (reporting AI usage among legal professionals rising from 22% (2024) to 80% (2025)).
-
Clio, press release on the 2024 Legal Trends Report (reporting 79% of legal professionals incorporating AI tools into daily work in 2024, up from 19% in 2023). Counselwell/Spellbook, AI in Legal Departments: 2025 Benchmarking Report (reporting 97% of legal department professionals already using AI rate it effective).
-
Best Lawyers, “Canadian Firms Explore AI, But Few Fully Embrace the Shift” (Dec 12, 2025) (reporting 80% of firms with >20 lawyers piloting/investigating AI; 7% fully implemented).
-
The Sedona Conference, The Sedona Canada Primer on Artificial Intelligence and the Practice of Law (June 2025).
-
R. v. Khelawon, 2006 SCC 57 (threshold reliability gatekeeping; principled approach serves truth-seeking).
-
R. v. Bradshaw, 2017 SCC 35.
-
Sedona Canada Primer, supra note 171 (distinguishing use cases and risks; emphasizing that constrained/document-based applications differ from open-ended generation).
-
Bradshaw, supra note 173.
-
Sedona Canada Primer, supra note 171.
-
Hussein v. Canada (Immigration, Refugees and Citizenship), 2025 FC 1138.
-
Khelawon, supra note 172.
-
Sedona Canada Primer, supra note 171.
-
Ko v. Li, 2025 ONSC 2965.
-
R. v. Bradshaw, 2017 SCC 35 (threshold reliability turns on overcoming the dangers that arise from reduced ability to test); Sedona Canada Primer, supra note 171 (transparency/process needed for validation).
-
Bradshaw, supra note 181; R. v. Khelawon, 2006 SCC 57.
-
Khelawon, supra note 182 (threshold admissibility vs ultimate assessment); Bradshaw, supra note 181.
-
Ibid (isolated issues generally affect weight).
-
Khelawon, supra note 182 (evidentiary rules serve truth-seeking and fairness; principled approach resists mechanical barriers).
-
Sedona Canada Primer, supra note 171 (document review / eDiscovery and AI use cases; organizing voluminous records).
-
Canada Evidence Act, RSC 1985, c C-5, ss 31.1–31.3 (authentication and integrity for electronic documents; process/integrity focus supports verification-based approaches).
-
Ko, supra note 180; Hussein, supra note 177 (hallucination cases as open-domain generation failures; disclosure/verification as the response).
-
Sedona Canada Primer, supra note 171 (pattern detection/analytics as assistive tools subject to validation and human review).
-
Hussein, supra note 177 (disclosure rule purpose is notice and due diligence; distinct from whether record assertions are supported).
-
Sedona Canada Primer, supra note 171.
-
See, e.g., Ko v. Li, 2025 ONSC 2965 (verification/certification rationale responding to “AI hallucinations”); Canadian Bar Association, Ethics of Artificial Intelligence for the Legal Practitioner (emphasizing competent use and verification within ordinary professional obligations).
-
Law Society of Ontario, Rules of Professional Conduct, Rule 3.1-2.
-
Law Society of Ontario, Licensee use of generative artificial intelligence (White Paper, April 2024).
-
Ko v. Li, 2025 ONSC 2965 at paras 30–34 (Rule 4.06.1(2.1) enacted to address “AI hallucinations”; requires certification of authenticity of every authority cited).
-
Law Society of British Columbia, Guidance on Professional Responsibility and Generative AI (Oct 12, 2023) (discussing risks of inaccurate output and the need for professional judgment, supervision, and verification).
-
Law Society of Alberta, “Generative AI and Technological Competence: Quick Tips for Alberta Lawyers”.
-
Canadian Bar Association, Ethics of Artificial Intelligence for the Legal Practitioner.
-
Ko v. Li, 2025 ONSC 2965 (authority-authentication certification and the underlying duty to ensure accuracy of cited sources); Law Society of Ontario Rules of Professional Conduct, Rule 3.1-2 (competence standard).
-
See Federal Court, Notice on Use of AI in Court Proceedings (disclosure regime and principles); Law Society of Ontario White Paper (disclosure/oversight as part of responsible practice).
-
Federal Court, Updated Notice on Use of AI in Court Proceedings (effective May 7, 2024).
-
Canadian Bar Association, AI ethics toolkit (prudence of transparency where AI plays a meaningful role; avoiding misleading impressions); Federal Court Notice (rationale: notice enables due diligence).
-
Federal Court Notice, supra note 201.
-
Law Society of Ontario, White Paper (April 2024) (confidentiality/privilege risks with generative AI and cloud-based tools); Law Society of British Columbia AI guidance.
-
Canadian Bar Association, Legal Ethics in a Digital Context (vendor/terms-of-service review, security features, custody/control of confidential records); LSO White Paper.
-
Law Society of Alberta, “Gen AI Rules of Engagement for Canadian Lawyers” and related LSA materials.
-
Law Society of Ontario, Licensee use of generative artificial intelligence (White Paper, April 2024).
-
Law Society of British Columbia AI guidance (risk-mitigation steps); Law Society of Ontario White Paper (best practices and documentation of safeguards); CBA Digital Ethics guidance.
-
OpenAI, GPT-4 Technical Report (2023); Anthropic, Claude 3 Model Card (2024) (documenting generational performance gains).
-
OpenAI, GPT-4V(ision) System Card (2023); Microsoft, Multimodal AI and Vision Models (2024).
-
MIT Computer Science and Artificial Intelligence Laboratory (CSAIL), research on multimodal representation learning and diagram understanding (e.g., MIT News, 2023–2024).
-
Harvey AI product documentation; LexisNexis, Lexis+ AI overview; Thomson Reuters, CoCounsel product materials.
-
Stanford Center for Research on Foundation Models & industry benchmarking reports comparing domain-specific versus general LLMs in contract analysis and legal reasoning tasks (2024–2025).
-
Anthropic, Claude Safety & Reliability Documentation (2024) (describing grounding, uncertainty signaling, and context-awareness features).
-
European Commission Joint Research Centre, Explainable Artificial Intelligence (2023); MIT CSAIL, explainability research summaries.
-
OECD, AI Policy Observatory (tracking global regulatory developments).
-
Regulation (EU) 2024/1689 (Artificial Intelligence Act), OJ L, 2024.
-
Bill C-27, Digital Charter Implementation Act, 2022, Part 3 (Artificial Intelligence and Data Act).
-
Innovation, Science and Economic Development Canada (ISED), AIDA Backgrounder (2023–2024).
-
Federation of Law Societies of Canada, Model Code commentary and AI coordination statements (2023–2025).
-
Law Society of Ontario, Licensee Use of Generative Artificial Intelligence (2024); Law Society of Alberta, AI guidance materials.
-
Maura R. Grossman & Gordon V. Cormack, “Technology-Assisted Review in E-Discovery” (2011); Remus & Levy (2017); Yale and Stanford legal-AI benchmarking studies.
-
The Sedona Conference, Canada Primer on AI and the Practice of Law (2025).
-
Dana Remus, public lectures and scholarship on AI and legal practice (2017–2024).
-
Canadian Bar Association, Ethics and AI Toolkit; Law Society guidance on access to justice and technology-enabled practice.
-
The Sedona Conference, The Sedona Canada Primer on Artificial Intelligence and the Practice of Law (June 2025), 26 Sedona Conf. J. 103 (2025).
-
R. v. Khelawon, 2006 SCC 57.
-
Canada Evidence Act, RSC 1985, c C-5, ss 31.1–31.3.
-
Ko v. Li, 2025 ONSC 2965 (sanctions / strong judicial response to AI hallucinations and misuse); Hussein v. Canada (Immigration, Refugees and Citizenship), 2025 FC 1138 (court acknowledges benefits of AI while requiring safeguards).
-
R. v. Bradshaw, 2017 SCC 35.
-
Federal Court, Notice to the Parties and the Profession: Use of Artificial Intelligence in Court Proceedings (updated May 7, 2024).
-
Richard Susskind, Online Courts and the Future of Justice (Oxford University Press, 2019) (warning against “irrational rejectionism” in evaluating justice technology).
-
See, e.g., the technology-neutral structure of electronic-record admissibility under Canada Evidence Act ss 31.1–31.3.
-
Canada (Minister of Citizenship and Immigration) v. Vavilov, 2019 SCC 65.
-
The Sedona Conference, The Sedona Canada Primer on Artificial Intelligence and the Practice of Law (June 2025).
-
R. v. Khelawon, 2006 SCC 57.
-
R. v. Bradshaw, 2017 SCC 35 (threshold reliability framework; reception vs weight).
-
The Sedona Conference, The Sedona Canada Primer on Artificial Intelligence and the Practice of Law (June 2025).

Secret Abuse: Reliance on Conclusory Deferral Within a Sealed “Procedural Bunker”.

Judges Rely on a Manufactured Public Narrative and Preemptively Foreclose Before Reaching the Merits.

The Result: Institutional Protectionism Within a Sealed Echo-Chamber.

AI Has Proven to Be an Objectively Verifiable Voice Amid Oppressive Conditions.

Danyluk v. Ainsworth Technologies Inc., [2001] 2 SCR 460, 2001 SCC 44 at para 33:
“The rules governing issue estoppel should not be mechanically applied. The underlying purpose is to balance the public interest in the finality of litigation with the public interest in ensuring that justice is done on the facts of a particular case. (There are corresponding private interests.) The first step is to determine whether the moving party (in this case the respondent) has established the preconditions to the operation of issue estoppel set out by Dickson J. in Angle, supra. If successful, the court must still determine whether, as a matter of discretion, issue estoppel ought to be applied: British Columbia (Minister of Forests) v. Bugbusters Pest Management Inc. (1998), 50 B.C.L.R. (3d) 1 (C.A.), at para. 32; Schweneke v. Ontario (2000), 47 O.R. (3d) 97 (C.A.), at paras. 38-39; Braithwaite v. Nova Scotia Public Service Long Term Disability Plan Trust Fund (1999), 176 N.S.R. (2d) 173 (C.A.), at para. 56.”