7 Questions to Ask Before Giving a Vendor Access to Your Data Set in an Artificially Intelligent World
by: Tatyana Ruderman with contributions by: Dhara Shah
In 2024 we have seen many voices enter the chat on how to approach AI governance, globally and locally. One thing is clear: the time to begin addressing AI compliance is yesterday, and the considerations below are a good place to start.
To start thinking about what more you may need to do, we present you with 7 questions to ask before handing over access to your data set.
Question 1. What Are My Legal Obligations?
Regulations that specifically target artificial intelligence/machine learning (AI/ML) or certain automated decision-making technologies are evolving quickly, but use of these Big Data technologies is nothing new, and it is already regulated under existing laws. AI/ML technologies, including discriminative (non-generative) AI, have long been used in analytics and similar internal business tools to help study and analyze data, make predictions, automate processes, detect fraud, and otherwise improve a company's products and services. There is no doubt that use of such AI tools provides great value to businesses.
However, under new guidance and legal frameworks, certain tools and vendors may require additional compliance work before implementation, particularly where emerging AI/ML technologies are used, such as generative AI (genAI), including large language models (LLMs) and, now, large vision models (LVMs). As a very distilled list, emerging AI/ML-specific regulation focuses on requirements for AI governance (programs and documentation, risk assessments, training), transparency (notice to end users, labeling, documentation), accountability (registration, third-party review), and individual rights (opt-out/appeal, nondiscrimination).
Recent regulation includes:
US state AI Acts (such as in Colorado and Utah),
Use of AI tools to process personal data will trigger requirements under US and international comprehensive privacy laws, including transparency requirements, data minimization/retention standards, data access/deletion rights, etc. US state privacy laws already specifically govern certain aspects of AI. These state privacy laws currently exist in California, Colorado, Connecticut, Virginia, Utah, and Florida, with laws in Texas, Oregon, and Montana taking effect soon. Among other requirements, they require proper disclosures, consents, and opt-out rights for users when AI-powered decisions grant or deny financial or lending services, insurance, housing, health care services, employment, educational opportunities, or basic necessities. Certain states also impose additional requirements. For example, Colorado requires disclosure of: (1) the logic and training used to create the AI tool; (2) whether the AI tool has been evaluated for accuracy, fairness, and bias; and (3) why the AI tool must be used. California is proposing additional regulations that will likely place similar transparency requirements on businesses.
Examples of other official statements, advisories, and industry guidance include:
A joint statement by US federal agencies (the CFPB, DOJ, EEOC, and FTC) emphasizing that aspects of AI systems are governed by existing laws (such as laws governing discrimination, deception, and privacy),
Legal/industry group frameworks (such as those from the World Economic Forum, and CARU's guidelines on AI-generated advertising directed at children and data collection, which we write about here).
Taking advantage of the hottest, most exciting vendor solutions is difficult to navigate: technologies evolve far more quickly than laws, few people truly know and understand exactly how each tool is built and how it works, and there is a lack of clear, cohesive legal guidance. Yet companies that deploy AI tools are responsible not only for their own compliance, but also for ensuring their use of AI is ethical.
Question 2. Have I Double-Checked Whether the Data Processing Involves Use of AI/ML/Automated Decision-Making Technology?
Guidance around use of artificial intelligence/machine-learning or automated decision-making tools remains quite murky and in flux. There is even a lack of consensus around what counts as AI/ML in the first place. Because this question involves so many nuances, vendor tools must be evaluated on a case-by-case basis.
As mentioned above, you may be surprised to find that many of your vendors do indeed use technologies or subprocessors that integrate some degree of AI/ML or automated decision-making technology, based on how these are defined across various laws and guidance (such as tools that involve basic automation or analytics). Use of such AI tools is also growing in areas where automated decision-making allows for more efficiency, such as entering or analyzing data, providing customer service, and managing inventory.
In evaluating a vendor, it will be very important to consider how the vendor views itself and its services, and to bear in mind that a vendor's silence on the topic speaks volumes. To evaluate this, ask:
Does the vendor represent that it has a robust compliance program in place? Certain vendors acknowledge and address their use of AI tools on their websites, in technical documentation, FAQs, contractual terms, etc., which may help assure you that they are well aware of the legal and regulatory frameworks governing their services and of their compliance obligations.
Are references to AI/ML or automated technologies buried in the contract or technical documentation, or not acknowledged at all? For some vendors, it may take more digging to understand how their solution works and whether it involves use of AI/ML/automated decision-making technology, and even more work to get on the same page about how to delineate contractual obligations.
Question 3. Is the Vendor’s Tool Subject to Special Regulation?
The use of AI/ML or automated decision-making tools alone may not rise to the level of requiring additional compliance beyond what your privacy and data security compliance already requires; this very much depends on the specific context in which the tool is used.
At a very high level, additional AI-specific regulation will mostly impact AI solutions that do more than simply automate. You will want to pay extra attention whenever individuals' data is used to train AI, particularly data that is "sensitive" (such as health data, biometrics, or precise location), where the processing is of a type considered more "high risk," or where the tool makes consequential/significant decisions (such as employment or financial decisions).
Different rules and degrees of regulation will apply depending on: whether you are a developer (e.g., a vendor or business building the AI/ML tool), a deployer (e.g., a business implementing a tool in its product), or a user of AI (e.g., personnel); what industry you are in (for example, several state bar associations, such as those of California and Florida, have issued guidance to attorneys on how to reasonably use AI in accordance with professional responsibility requirements); and, of course, what jurisdictions you are in.
Question 4. What Data Is Being Input Into the AI/ML System, and How Much of It?
Personal Data & Consent. Data used to train AI systems must be collected and processed in compliance with all laws. To ensure this, you'll need to understand exactly what data types will be input to the AI tool. Depending on the context, direct consent from users may be needed. If sensitive data (including certain demographic data, biometrics, health data, children's data, or precise geolocation) or potentially sensitive data (like photos/videos) is used, a higher level of notice and consent will be required, as enforcement is focused on "high risk processing activities" (and you may also need to consider compliance under other existing frameworks, such as those for biometrics or children's data). If only "anonymous" data is being used, make sure it is truly anonymous under the strictest applicable standards, and that it would not otherwise be possible for the AI tool to re-identify an individual based on patterns gleaned from other data types.
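For readers who want to pressure-test that last point technically, one simple check is to measure how unique each record is across "quasi-identifiers" (fields that are not names but can still single a person out). The Python sketch below is purely illustrative; the column names and the threshold of 5 are our own assumptions, and a real anonymization review should involve privacy engineering and counsel.

```python
# Illustrative sketch: a crude k-anonymity check on a "de-identified" data set.
# The fields and threshold below are hypothetical assumptions, not legal standards.
from collections import Counter

def min_group_size(records: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the smallest group of records sharing the same combination of
    quasi-identifier values (the data set's 'k'). A small k means some records
    are nearly unique and may be re-identifiable."""
    counts = Counter(tuple(r.get(q) for q in quasi_identifiers) for r in records)
    return min(counts.values()) if counts else 0

# ZIP code + birth year + gender are classic quasi-identifiers that can
# re-identify individuals even after names are removed.
records = [
    {"zip": "60601", "birth_year": 1985, "gender": "F"},
    {"zip": "60601", "birth_year": 1985, "gender": "F"},
    {"zip": "60602", "birth_year": 1990, "gender": "M"},  # unique -> k = 1
]

k = min_group_size(records, ["zip", "birth_year", "gender"])
if k < 5:  # hypothetical threshold; the appropriate k depends on context
    print(f"k = {k}: at least one record is nearly unique; treat as identifiable")
```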
Data Minimization/Retention Requirements. The amount of data processed within the tool must align with data minimization and limitation principles, and the data must be deleted in accordance with retention schedules and policies. This requires a careful and critical inquiry, and documentation, to assess the minimum amount of data reasonably necessary to provide the service (and note the tension: avoiding other potential AI harms, like bias, may require a lot of data). Data minimization standards are also becoming even stricter under certain forthcoming US state laws, such as Maryland's.
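Documentation of that inquiry can be as simple as a machine-readable retention schedule that your teams actually enforce. The sketch below shows one hypothetical way to record it; the data categories, purposes, and periods are invented examples, not recommended values.

```python
# Illustrative sketch: a documented retention schedule for data shared with an
# AI vendor. Categories, purposes, and periods are hypothetical examples.
from datetime import date, timedelta

RETENTION_SCHEDULE = {
    # data category: (documented purpose, retention period)
    "support_transcripts": ("fine-tune support chatbot", timedelta(days=365)),
    "order_history": ("fraud-detection model features", timedelta(days=730)),
}

def is_expired(category: str, collected_on: date, today: date | None = None) -> bool:
    """True if a record in this category has outlived its documented retention period."""
    _purpose, period = RETENTION_SCHEDULE[category]
    return (today or date.today()) > collected_on + period

# A record collected on 2022-01-15 under a 365-day schedule is long past due.
print(is_expired("support_transcripts", date(2022, 1, 15)))  # True
```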
Question 5. What Data Comes Out of the AI/ML Process?
Consequential/Significant Decisions. It will be key to understand, at a baseline, how the tool produces its outputs, in particular where an output involves consequential/significant decisions impacting an individual. This understanding is needed to transparently explain the tool to end users (for example, if required, to explain the logic behind how the AI system arrives at its conclusion) and to feel confident that the tool works as intended (for example, that its predictions are accurate), among many other considerations.
Privacy-Exposing Inferences. AI-powered outputs become data you hold and will be subject to your existing privacy and data security obligations. You will want to understand what data you can expect to receive from the AI tool. Is the output data more sensitive than the input data (for example, inferences drawn about a person's movements or activities that reveal health or mental status)? Is the data adequately protected from unauthorized access? Would you want your customers to know you hold this kind of data (for example, if you are required to produce it in response to an access request)?
Question 6. Can the Vendor Properly Assist You in Complying With the Existing and Forthcoming Laws and Regulatory Requirements That Govern Your Business?
Depending on the specific context, you'll need to make sure your (or the vendor's) standard contractual terms and privacy/data security attachments adequately address use of the tool, including any applicable privacy or data security requirements or AI-specific regulations.
It will be essential to evaluate exactly how the vendor will assist you in meeting your privacy and data security compliance obligations, for example, in handling a downstream consumer request to delete data or to opt out of automated decision-making. The very nature of AI is at odds with certain privacy requirements and poses novel issues (for example, the requirement to minimize data is in tension with AI's insatiable need for training data, and properly deleting data presents a challenge when that data has already been used to train a model).
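On the deletion point, one practical mitigation, sketched hypothetically below, is to keep a ledger of which training runs consumed which records. That way a deletion request can at least be honored prospectively (the record is excluded from future retraining), and you know which past runs to raise with the vendor. Whether that is sufficient depends on the applicable law and your contract; the class and method names here are our own invention.

```python
# Illustrative sketch: tracking which training runs used which records, so a
# deletion request can be honored going forward. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class TrainingLedger:
    usage: dict[str, set[str]] = field(default_factory=dict)  # record_id -> run ids
    deleted: set[str] = field(default_factory=set)

    def log_use(self, record_id: str, run_id: str) -> None:
        """Record that a training run consumed this record."""
        self.usage.setdefault(record_id, set()).add(run_id)

    def process_deletion(self, record_id: str) -> set[str]:
        """Mark a record deleted; return past runs that used it, so remediation
        (e.g., retraining or vendor follow-up) can be assessed."""
        self.deleted.add(record_id)
        return self.usage.get(record_id, set())

    def eligible_for_training(self, record_id: str) -> bool:
        """Gate future training jobs on deletion status."""
        return record_id not in self.deleted

ledger = TrainingLedger()
ledger.log_use("user-123", "run-2024-03")
print(ledger.process_deletion("user-123"))       # {'run-2024-03'}
print(ledger.eligible_for_training("user-123"))  # False
```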
You’ll want assurance that your vendor will be a helpful and forthcoming partner in helping you navigate these complicated issues.
Question 7. What Are the Potential Harms to Your Business and End Users? Does the Use of This Technology Align With Business Branding and Strategy?
Deploying certain new technologies that involve AI/ML or automated decision-making can pose fairly significant risks to your business, or potential harm to end users and/or society. Before engaging a vendor, you’ll want to generally assess whether the benefit of the tool is worth the potential risks, and consider what steps need to be taken to protect against core AI risks and harms, such as to avoid algorithmic bias and discriminatory impact in the AI outputs. Mitigate some of these risks by seeking input from multiple stakeholders to ensure use of the tool aligns with business goals, strategy, and branding. This should include product owners, procurement, IT/security, information systems, legal, marketing, and any other key roles – and be sure to document this input and how it was addressed.
As a few examples of potential enforcement: the FTC has pursued several cases against companies for alleged unlawful use of AI, with penalties including algorithmic disgorgement (such as this settlement with Rite Aid), and AI has been the subject of various class actions (for example, a proposed class action against Home Depot and Google). Texas has released a statement indicating it will aggressively enforce its consumer protection laws (including its privacy law, which governs AI). Maintaining documentation of key stakeholders' involvement will help offset some of these risks, allowing you to avoid or mitigate such investigations or enforcement by demonstrating due diligence and responsible practices around your use of AI/ML and automated decision-making tools.
Originally published by InfoLawGroup LLP.