Optical Governance: The Roles of Machine Vision in China’s Epidemic Response
Gabriele de Seta on the entanglement of social practices and automated sensing technologies.
Drawing on different use cases of machine vision in China’s epidemic response, Gabriele de Seta revisits James C. Scott’s concept of legibility and identifies a paradigm of “optical governance” emerging from the entanglement of social practices and automated sensing technologies.
Since the early days of the COVID-19 outbreak in Wuhan, machine vision has played a central role in the management of the novel coronavirus epidemic. From the haphazard social media monitoring of Hubei license plates to the repurposing of traffic surveillance cameras, existing media systems were pressed into service for early response strategies. As the state kicked quarantine measures into gear, China’s tech industries responded to the state of emergency by offering their platforms to authorities and the public at large, deploying a wide array of optical tools and sensing systems ranging from temperature measurement and health diagnostics to self-driving vehicles and assistive robots. It is not surprising that machine vision is a privileged category of technological responses to an epidemic: when both the threat and its symptoms are invisible to the human eye, relying on non-human sensing promises an additional degree of agency and an extended layer of prevention.
***
A few months into the coronavirus crisis, I wrote some notes about how China’s epidemic management was enrolling machine vision technologies into something that I tentatively called—mashing up two classic theoretical templates—“seeing like a state of exception.” As the epidemic became a pandemic, states of exception fully settled into a new normality and, as we near the likely anniversary of COVID-19’s zoonotic jump, it becomes possible to think more comprehensively about the entanglement of vision machines and societies, identifying an emerging paradigm that I propose to call optical governance.
Leaving Agamben to undermine his own arguments regarding states of exception (he is doing that disappointingly well), I’ll instead revisit James C. Scott’s fortunate idea of “seeing like a state,” which continues to productively shape debates on the optics of sovereign power and governance. For Scott, a state sees like a simplifying machine, driven by a “high modernist” ideology and set on a quest to remedy the state’s own “blindness” by enforcing legibility: “officials took exceptionally complex, illegible, and local social practices, such as land tenure customs or naming customs, and created a standard grid whereby it could be centrally recorded and monitored” (1998, p. 2).
Legibility can be imposed through simplification on both society and nature, and “provides the capacity for large-scale social engineering” (p. 8). The precondition of legibility is a “narrowing of vision,” a tunneled field of view that “brings into sharp focus certain limited aspects of an otherwise far more complex and unwieldy reality” (p. 11). Legibility entails abstraction and loss of detail, a focusing on the proverbial trees of interest rather than on the forest in its agential complexity (p. 13). Scott illustrates this process through a wide range of case studies, from forest management and population surveys to measurement standards and urban planning.
Ultimately, Scott’s state sees from a central and elevated point of view, a synoptic perspective that allows schematic rationalization (p. 79); state simplifications are selective and reductive, and create data that is abstracted, biased, and static (p. 80). This productive insight is also one of the most commonly criticized aspects of Scott’s argument: by buying into the state’s performative claim to verticality and encompassment (Ferguson & Gupta, 2002), he avoids any meaningful discussion of capitalism (Coronil, 2001), or at best he blankets high modernist ideology over entities as different as securitized enclaves and transnational corporations (Ferguson, 2005). Not everything, it turns out, sees exactly like a state—perhaps not even states themselves.
Tania Li has argued for the need to move beyond Scott’s account of state optics, expanding it through analyses of other actors and their “messy, contradictory, multilayered, and conjunctural effects” (2005, p. 384) on governance. This call has been answered by a proliferation of ways of seeing, and the city has emerged as the paradigmatic locus of complexity (Magnusson, 2011). Seeing like a city implies an “epistemologically hybrid approach to governing space” (Valverde, 2011, p. 277) that upends Scott’s synoptic perspective. The layering of digital infrastructures over and beyond urban space beckons new optics—seeing like big tech (Tréguer, 2019) with its hold over data flows and cloud borders; seeing like platforms (MacKenzie & Munster, 2019) that operationalize visuality itself; or seeing like algorithms (Uliasz, 2020) and their automation of governance.
It is tempting to speculate about what it means to be “seeing like” the latest sociotechnical facet added to the compound eyes of planetary computation, but I am more interested in evaluating the current relevance of Scott’s idea of legibility and the role it plays in the emerging paradigm of optical governance. China’s epidemic response, which by the summer of 2020 had allowed the country to largely contain COVID-19 and bounce back into economic recovery, is an interesting case study in optical governance precisely because of the degree to which machine vision technologies were enrolled into its strategy as symbolic and practical articulation joints between the invisibility of viral contagion and the visibility of its social effects, media representation, and state responses.
As Louise Amoore notes, computation itself is directly linked to the production of legibility: “cloud analytics visualize and render perceptible that which could never be observed directly, could not be brought into view as with an optical device” (2018, p. 19). The computational automation of optical technologies promises to make legible what lies beyond the visible, and this is precisely why machine vision—as a broad category of systems and applications—has become a central protagonist of epidemic solutions. In order to evaluate the current relevance of legibility and then sketch a definition of optical governance, I will briefly review three examples of how machine vision has been deployed in China during the epidemic, starting from the most basic interfacial use (QR codes), passing through performative securitization (temperature measurement), and ending with ambient automation (assistive robotics).
QR codes as machine-readable gateways
QR (Quick Response) codes are the most unassuming and perhaps most impactful use of machine vision for epidemic management. Debuted in Japan in 1994 by the DENSO corporation as a rapid and reliable improvement on UPC (Universal Product Code) barcodes, QR codes are machine-readable square matrix labels capable of storing different kinds of data with a high degree of redundancy. After fifteen years of use in industrial manufacturing as component tracking labels, social media platforms in Japan and Taiwan started offering QR codes as shortcuts for users to access content through their mobile device camera. Following suit, in the early 2010s Chinese platform companies integrated them into apps like WeChat and Alipay as machine-readable gateways for social networking, information retrieval, and monetary transactions.
For anyone familiar with the popularity of QR codes in China, their repurposing as epidemic data gateways is less than surprising. During the first weeks after the COVID-19 outbreak in Wuhan, public venues and residential communities relied on paperwork and color-coded permits to trace travel histories and regulate mobility. By February 2020, telecom operators started offering a more reliable tool by providing a “proof of itinerary” based on a user’s roaming history via SMS; but at the same time, Chinese tech companies were already at work on repurposing QR codes to integrate all these functions in one accessible and interoperable artifact that was both faster to verify and safer to handle.
This choice was not arbitrary: Tencent drew on experience with its residential community management platform Haina (now rebranded as Tencent Cloud Future Community) to deploy its “health code” in the city of Hefei on February 8; similarly, Alibaba launched its own version in Hangzhou on February 12. In a matter of days, the two companies expanded their solution to most provinces across the country, largely reproducing their current duopoly over the digital transaction market, and forcing state authorities to remedy the proliferation of city-based health codes by demanding interoperability across systems and pushing companies towards the formulation of a national standard.
The red, yellow, and green-colored QR codes have become perhaps the most iconic symbol of China’s pandemic response, and much of the debate around them has focused on their actual usefulness in staving off contagion, on their opaque and unreliable functioning, and on the “data islands” resulting from their fragmentation. But another interesting part of this story is the rapid standardization process undergone by this new use of the QR code: as early as March 5, the Shenzhen Standards Promotion Association and Tencent led the drafting of a national health code standard, and only twenty-five days after being deployed as a prototype, the Tencent health code was selected as a model for the nationwide use of these systems.
The nationwide adoption and standardization of health codes was premised on the existence of community management systems, the popularity of QR codes as digital gateways, and the well-documented enlisting of platform companies by the state. Different ways of seeing intersect in the black and white grid matrix of the QR code: the permit-based gaze of neighborhood checkpoints, the platform-based identification of users, and the interoperable transparency of nationwide standards. Legibility is distributed along the circulation of these square matrices: users delegate their identification to the personal information they already handed over to platforms like WeChat or Alipay, in exchange for a human-readable color code and a machine-readable epidemic status that grants mobility and access; tech companies compete to ensure the smooth and widespread adoption of their proprietary systems, while the state channels private interests into standardization and reserves a right to the data collected with each code scan.
Temperature measurement as panspectral surveillance
The KC N901 Smart Helmet, launched in late February 2020 by Shenzhen-based tech company Kuang-Chi Technologies, exemplifies a radically different use of machine vision in epidemic management. Named after a Ming dynasty scientist, the Kuang-Chi group (of which Kuang-Chi Technologies is a subsidiary) was founded in 2010 by five returning graduates from Duke and Oxford Universities, and in 2012 it was the first Chinese tech company visited by Xi Jinping during his first inspection tour of Guangdong as General Secretary of the CCP. Specializing in metamaterials, optical communication, and aerospace technologies, Kuang-Chi has consistently embraced state initiatives encouraging international investment and public-private partnerships, and is today a major example of China’s “civil-military fusion” strategy of dual-use technology development.
Promoted as the “first choice for epidemic prevention and control,” the KC N901 Smart Helmet combines a lightweight shock-absorbing material shell and military-grade protective goggles with a thermal camera and a 300-nit augmented reality display with a 35° field of view. The infrared sensor, capable of measuring the temperature of two hundred people per minute with an accuracy of ±0.3°C in a “quick, unaware, and contactless” way, can be combined with a high-definition front camera for a variety of use modes, including single person and crowd temperature measurement, QR code scanning, license plate checks, and face recognition. According to Kuang-Chi, the envisioned application scenarios include non-contact traffic monitoring, frontline medical work, and epidemic prevention in public and private spaces.
Kuang-Chi clearly designed the KC N901 in response to the COVID-19 epidemic, drawing on its expertise in metamaterials and optical imaging to develop a device showcasing the company’s smart city ambitions. None of the technologies fitted in this polymer helmet are radical innovations: AR displays, high-resolution cameras, infrared sensors, and lithium batteries are all more or less common components of portable digital devices. What makes this “smart helmet” representative of a larger effort in optical power is its striking visuality and symbolic weight as a policing accessory capable of projecting the state’s epidemic monitoring effort while also hinting at an opaque capacity for augmented surveillance. It is not surprising that, as soon as it was delivered to police forces across China, the KC N901 became a media sensation paraded in city centers and transportation hubs.
In practical terms, it is by now well understood that temperature measurement has a limited impact on the prevention of COVID-19 outbreaks and that many thermal camera manufacturers utilize deceptive marketing, but these details are secondary: what technologies like this smart helmet promise is a platform with “panspectral” (Creemers, 2017, p. 12) surveillance potential. Thermal imaging at a distance is the public service front-end, while what matters to its users is the back-end triangulation of different spectrums of legibility: QR-encoded information, license plates, facial features, and so on. Previous trials of augmented reality policing in China have been reportedly met with little enthusiasm, but the COVID-19 epidemic has provided a scenario in which these kinds of devices become more acceptable (or even welcomed) technical solutions to the invisibility of the viral threat.
The success of the Kuang-Chi KC N901, which has now been distributed worldwide for prices upwards of US$5,000, exemplifies a specific optical position. Even if the helmet’s front camera is nested in a cavity clearly designed to frame the Chinese police force insignia, Kuang-Chi stresses that its face recognition and license plate verification capabilities are dependent on the selected use mode and the data provided by the customer. The legibility these helmets afford is highly dependent on the authority wielded by their users and the databases their systems draw upon, and might range from a simple temperature reading to a comprehensive triangulation of a person’s identity, travel history, and health status. The smart helmet is a platform capable of generating legibility across spectrums and varieties of machine-readable patterns.
Assistive robotics as ambient sanitization
Whereas QR codes function as interoperable gateways between citizens and monitoring systems, and thermal cameras are deployed as Trojan horses for panspectral surveillance, a third example of machine vision used for epidemic response exemplifies how the world is made readable for machines themselves: everyday robotics. Alongside remote-controlled drones, autonomous robots of various shapes and sizes have dominated early reporting about China’s COVID-19 outbreak. Mostly deployed in hospital wards and quarantine centers as assistive technologies by virtue of their innate viral immunity, these machines have become the (stylized and user-friendly) mascots of the country’s newfound resilience through automation.
The buzz around robotics is another case in which the epidemic state of exception provided an unexpected scenario for technologies fostered by long-term national development strategies. As Sarah O’Meara notes, the Chinese government has stressed the importance of robotics since at least 2006, with the publication of the fifteen-year plan for science and technology, and Chinese academic research in robotics and automation has burgeoned over the past two decades (2020a). The COVID-19 epidemic has only accelerated this trend, but because of the high cost of precision medical robots, experimentation has focused on more affordable homegrown models such as the twelve service robots installed by Beijing-based company CloudMinds in a ward of Wuchang Hospital in Wuhan at the end of February, programmed to disinfect surfaces, deliver food and medication, and measure patient temperatures (O’Meara, 2020b).
Since the beginning of the coronavirus epidemic, Chinese companies like CloudMinds, Keenon, Iben, and OrionStar have shifted their focus towards developing service robots dedicated to disinfection, delivery, basic diagnostics, and patient interaction. These robots are designed to operate mostly on hospital or hotel floors and share similar form factors: minimally anthropomorphic cabinets or trays on rubberized wheels. The kind of machine vision powering them overlaps only partially with the examples discussed above: while some of these service robots might be equipped with thermal and face recognition cameras, the core applications of their optical systems are spatial navigation and interaction. Many of the robots debuted in COVID-19 hospitals and quarantine wards across China were in fact powered by specialized 3D cameras developed by US company Orbbec and characterized by relatively low resolutions (640 × 480) and low power consumption (Scimeca, 2020).
The main rationale behind this sort of automated solution is to delegate basic tasks like disinfection, delivery, and patient interaction to machines in order to minimize human exposure to the virus in high-risk spaces. Machine vision enables the specialized mobility required for these robots to operate, and is mainly directed towards making their surroundings legible to them. Preliminary analyses of assistive robotics observe how these technologies respond to the “physical distancing logics of virus management” (Chen et al., 2020, p. 239) while also functioning as promotional demonstrations of a future “robotic infrastructure” yet to be scaled up. As these robots enjoy their moment of glory in trade fair showcases, the growth of China’s service automation industry hints at the possible development of what some commentators are starting to call wuren jingji, or the “unmanned economy” to come.
From legibility to machine-readability
It would be relatively straightforward to read each of the examples outlined above through the main concepts proposed in James C. Scott’s Seeing Like a State: abstraction, simplification, legibility, and a high modernist ideology oriented toward large-scale state intervention. And while it is important to continue asking what it means to see like an AR goggle, a diagnostic algorithm, a nanobot, and so on, it is also necessary to inquire more broadly about how these technologies intersect and coalesce into more complex optical infrastructures, and how these end up being governed.
First and foremost, the point made by Scott’s critics still stands, given how prominently capitalism and the tech industry are involved in the development and implementation of machine vision, both challenging and reinforcing the state’s bureaucratic claim to its synoptic position. Secondly, machine vision technologies do not necessarily see like a state: their abstractions do not always result in the “resolute singularity” of high-modernist optics (Scott, 1998, p. 347), and the legibility they produce is not bound to lose detail. As a matter of fact, many of these technologies operate by triangulating datasets and patterns across spectrums, layering predictions and producing aggregate knowledge that is often more detailed than the sum of its parts.
The case of China, which happened to be the first country to be hit by the COVID-19 pandemic and is one of the major players in the global AI industry, exemplifies how machine vision can be enrolled as a solution to the invisibility of contagion, allowing governments to visibilize health, manage societal flows, and avoid contact between viral vectors. Even though individual applications can be easily dismissed as technological solutionism (QR codes of dubious efficacy, thermal measurement as pretext for securitization, and robots as commercial showcase), the compound effect resulting from the deployment of these systems hints at a near future in which the governance of vision machines will become a matter of essential social relevance.
Where the state’s way of seeing its subjects and jurisdiction produces a legibility derived from selective abstraction, machine vision technologies multiply and layer legibility, drawing on different details, patterns, resolutions, and spectrums. Following Jack Stilgoe’s analysis of autonomous cars, it can be argued that this emerging optical infrastructure is predicated on a widespread machine-readability, a new kind of legibility that transcends the autonomous view of any singular technology and brings its own demands for regulation and governance (2017, p. 16). From this follows the emergence of optical governance, which actively negotiates how the world is read by, and made readable to, increasingly complex assemblages of vision machines.
Optical governance is also not exclusively the domain of states: with varying degrees of agency, international bodies, companies, and users all participate in the production of readability, and thus have a stake in negotiations about its purchase. These negotiations will keep taking place as optical infrastructure comes to the foreground and recedes into the background: in June 2020, the sudden disappearance of all sorts of temperature measurement checkpoints in Beijing, perceptively noticed by science fiction author Han Song, does not mean that these technologies have served their purpose and will be decommissioned. Some of them, like personal QR codes, are already being considered as legacy gateways for post-pandemic use in health monitoring or disaster relief. Others, like smart helmets and delivery robots, will provide platforms for the production of machine-readabilities yet unseen. Optical governance is indispensable for determining which uses these are put to.
Cover image: Depositphotos
Gabriele de Seta
Gabriele de Seta is a media anthropologist. He holds a Ph.D. in sociology from Hong Kong Polytechnic University and was a postdoctoral fellow at the Academia Sinica Institute of Ethnology in Taipei. He is currently a postdoctoral researcher at the University of Bergen as part of the ERC-funded project Machine Vision in Everyday Life. His research work, grounded in ethnographic engagement across multiple sites, focuses on digital media practices and vernacular creativity in China. He is also interested in experimental music, internet art, and collaborative intersections between anthropology and art practice. More information is available on his website.
References:
Amoore, L. (2018). Cloud geographies: Computing, data, sovereignty. Progress in Human Geography, 42(1), 4–24.
Chen, B., Marvin, S., & While, A. (2020). Containing COVID-19 in China: AI and the robotic restructuring of future cities. Dialogues in Human Geography, 10(2), 238–241.
Coronil, F. (2001). Smelling like a market. The American Historical Review, 106(1), 119–129.
Creemers, R. (2017). Cyber China: Upgrading propaganda, public opinion work and social management for the twenty-first century. Journal of Contemporary China, 26(103), 85–100.
Ferguson, J. (2005). Seeing like an oil company: Space, security, and global capital in neoliberal Africa. American Anthropologist, 107(3), 377–382.
Ferguson, J., & Gupta, A. (2002). Spatializing states: Toward an ethnography of neoliberal governmentality. American Ethnologist, 29(4), 981–1002.
Li, T. M. (2005). Beyond “the state” and failed schemes. American Anthropologist, 107(3), 383–394.
MacKenzie, A., & Munster, A. (2019). Platform seeing: Image ensembles and their invisualities. Theory, Culture & Society, 36(5), 3–22.
Magnusson, W. (2011). Politics of urbanism: Seeing like a city (1st ed.). Routledge.
O’Meara, S. (2020a). Medical robotics on the rise. Nature, 582(7813), S51–S52.
O’Meara, S. (2020b). Bill Huang, robot engineer. Nature, 582(7813), S53.
Scimeca, D. (2020, May 7). 3D-vision-guided robots assist in COVID-19 measures in China. Vision Systems Design.
Scott, J. C. (1998). Seeing like a state: How certain schemes to improve the human condition have failed. Yale University Press.
Stilgoe, J. (2017). Seeing like a Tesla: How can we anticipate self-driving worlds? Glocalism: Journal of Culture, Politics and Innovation, 3, 1–20.
Tréguer, F. (2019). Seeing like Big Tech: Security assemblages, technology, and the future of state bureaucracy. In D. Bigo, E. Isin, & E. Ruppert (Eds.), Data politics: Worlds, subjects, rights (pp. 145–164). Routledge.
Uliasz, R. (2020). Seeing like an algorithm: Operative images and emergent subjects. AI & Society, 1–9.
Valverde, M. (2011). Seeing like a city: The dialectic of modern and premodern ways of seeing in urban governance. Law & Society Review, 45(2), 277–312.