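[NB: A follow-up to this repo using GPT-4 can be found here.]
Large language models (LLMs) such as ChatGPT and Claude have demonstrated impressive programming abilities and can solve problems across a wide range of languages and their taxonomies. Despite these successes, some scepticism persists over the extent to which these models show any fundamental appreciation of the syntactic and operational rules underlying these languages.
In this prompt-engineering repository, ChatGPT's programming abilities are explored using an arbitrary domain-specific language (DSL). DSLs are an attractive substrate for studying the reasoning capabilities of LLMs because they are novel and less likely to have been widely encountered and memorised during training. They therefore allow a more direct test of the extent to which LLMs can infer the rules of a novel programming language in a few-shot manner.
Here, the domain-specific language SIL (Symmetry Integration Language) was chosen for two reasons. Firstly, it is very unlikely that ChatGPT was exposed to any SIL code during training, since it is an in-house DSL developed by a technology-heavy hedge fund called Symmetry Investments. Secondly, as a programming language, it has some interesting features for the model to reproduce (e.g., it is a functional language that emphasises expressiveness, but it lacks `let` expressions as found in Haskell or OCaml).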

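After a few example code prompts, ChatGPT attempts to write code in a novel functional DSL called "SIL".
Below is a collection of prompts containing short examples of SIL code that highlight its functionality. After prompting ChatGPT with the task and providing SIL code examples (see the figure; the full prompt history and the SIL code examples are available in the repository), I asked it to implement a number of mainstream programming tasks in SIL.
In the sections below, I show some of the example SIL scripts with which the model was prompted (the full set of examples can be found here) and its attempts to implement various problems in SIL.
The first prompt is a script that handles some mail server functionality. It is therefore quite different from the types of problems I subsequently prompt ChatGPT with, but it does illustrate some of the syntax, data structures and functional aspects of SIL (e.g. `|>`).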
```
// example of using IMAP IDLE to run rules on new mail
import imap

moveMessages(session,ids,target) => if (ids.length > 0) then imap.moveUIDs(session,ids,target) else false
login = imap.ImapLogin(environment("IMAP_USER"),environment("IMAP_PASS"))
server = imap.ImapServer("imap.fastmail.com","993")
session = imap.Session(server,login,true,imap.Options(debugMode:true)) |> imap.openConnection |> imap.login

rules=[
    ["INBOX/0receipts",
        [
            "FROM [email protected]",
            "FROM interactivebrokers.com",
        ]],
    ["Junk",
        [
            "FROM Tapatalk",
        ]],
    ["INBOX/newsletters",
        [
            "FROM [email protected]", "HEADER X-mailer mailgun", "HEADER X-mailer WPMailSMTP/Mailer/mailgun 2.4.0", "HEADER X-mailer nlserver",
            "FROM hbr.org", "FROM elliottwave.com", "OR FROM cio.com FROM cio.co.uk", "FROM substack.com", "FROM eaglealpha.com",
            "FROM haaretz.com", "FROM gavekal.com", "FROM go.weka.io", "FROM marketing.weka.io", `HEADER list-unsubscribe ""`,
            `HEADER list-Id ""`, `HEADER list-Post ""`, `HEADER list-owner""`, `HEADER Precedence bulk`, `HEADER Precedence list`,
            `HEADER list-bounces ""`, `HEADER list-help ""`, `HEADER List-Unsubscribe ""`, "FROM no-reply",
        ]],
    ["INBOX/notifications",
        [
            `KEYWORD "$IsNotification"`, "FROM [email protected]", "FROM [email protected]", "FROM skillcast.com",
            "FROM reedmac.co.uk", "FROM [email protected]", "FROM [email protected]", "FROM [email protected]",
        ]],
]

runRules(Session,Rules) => Rules
    |> map(target => [target[0],(target[1] |> map(term => imap.search(Session,term).ids))])
    |> mapa(set => moveMessages(Session,set[1] |> join,set[0]))

runRulesBox(Session,Rules,Mailbox) => {
    imap.select(Session,Mailbox)
    in runRules(Session,Rules)
}

inboxes=["INBOX"]
result = inboxes |> mapa(inbox => runRulesBox(session,rules,imap.Mailbox(session,inbox)))
print(result)

import parallel;
threadFunction(x) => {
    imap.idle(session)
    in inboxes |> mapa(inbox => runRulesBox(session,rules,imap.Mailbox(session,inbox)))
}
parallel.runEvents((x) => false,[threadFunction])
```

The second example code prompt is similarly intended to highlight some of the features of this DSL, and it introduces some new standard library functions such as `iota` and `fold`.
```
import imap
import imap_config
import string

// Get the configuration from the environment and command line.
config = imap_config.getConfig(commandLineArguments)

// -------------------------------------------------------------------------------------------------
// Some helper functions.
//
// Firstly, a function to join an array of strings.
joinFields(flds, sep) => {
    len(flds) > 0 |> enforce("Cannot join an empty array.")
    in fold(flds[1:$], (str, fld) => str ~ sep ~ fld, flds[0])
}

// Secondly, a field formatter which strips the field prefix and pads to a fixed width.
// E.g., ("From: [email protected]" |> fmtField(20)) == "[email protected] "
fmtField(field, width) => {
    pad(str) => iota(width - len(str)) |> fold((a, i) => a ~ " ", str)
    in field
        |> string.split(":")[1:$]
        |> joinFields(":")
        |> pad
}

// And thirdly, a function which concatenates the headers into a formatted string.
fmtHeaders(outStr, headers) => {
    outStr ~ " " ~ joinFields(headers, " | ") ~ "\n"
}

// -------------------------------------------------------------------------------------------------
// Connect to the inbox.
creds = imap.ImapLogin(config.user, config.pass)
server = imap.ImapServer(config.host, config.port)
session =
    imap.Session(server, creds)
    |> imap.openConnection()
    |> imap.login()
inbox = imap.Mailbox(session, "INBOX")

// Get the number of messages in the inbox.
msgCount = imap.status(session, inbox).messages

// Select the default inbox.
inbox |> imap.examine(session, _)

// Get the headers (date, from and subject) for each message, from oldest to newest, format and
// print them.
headers =
    iota(msgCount)
    |> map(id => "#" ~ toString(id + 1))
    |> map(id =>
        imap.fetchFields(session, id, "date from subject").lines
        |> map(hdr => fmtField(hdr, 40)))
    |> fold(fmtHeaders, "INBOX:\n")
print(headers)
```

The third code example further illustrates some of the unusual features of this DSL, with the aim that the model will go on to use these features in its own implementations.
```
// This script will search for emails and match new issue numbers with resolutions to report the
// number of outstanding alerts.
import imap
import * from imap.query
import imap_config
import dates
import string

// Get the configuration from the environment and command line.
config = imap_config.getConfig(commandLineArguments)

// Connect to the inbox.
creds = imap.ImapLogin(config.user, config.pass)
server = imap.ImapServer(config.host, config.port)
session =
    imap.Session(server, creds)
    |> imap.openConnection()
    |> imap.login()
inbox = imap.Mailbox(session, "support")

// Select the default inbox.
inbox |> imap.examine(session, _)

// These criteria are common for both our searches.
commonCrit = imap.Query()
    |> and(from(`[email protected]`))
    |> and(sentSince(dates.Date(2020, 5, 13)))

// Get each of the alerts and resolutions from the past week (13-19 May 2020).
alertMsgIds =
    imap.search(session, imap.Query(subject("Alert: new issue")) |> and(commonCrit)).ids
resolutionMsgIds =
    imap.search(session, imap.Query(subject("Resolution: issue")) |> and(commonCrit)).ids

// A function to get the alert ID from a message subject.
getAlertId(msgId) => {
    imap.fetchFields(session, toString(msgId), "subject").lines[0]
    |> string.split()[$ - 1]
}

// A function to remove an entry from a table whether it's there or not.
removeIfExists(tbl, key) => {
    if find(keys(tbl), key) == [] then
        tbl
    else
        removeEntry(tbl, key)
}

// Now find those alerts which have no resolution. Firstly the subject for each alert, get the
// issue number end and store it in a table.
allAlertTable = alertMsgIds |> fold((tbl, msgId) => addEntry(tbl, getAlertId(msgId), msgId), {})

// Go through the resolutions and remove their corresponding alerts from the table.
unresolvedAlertTable =
    resolutionMsgIds |> fold((tbl, msgId) => removeIfExists(tbl, getAlertId(msgId)), allAlertTable)

// Create a report with the date of the unresolved alerts.
report =
    keys(unresolvedAlertTable)
    |> map(alertId => {
        msgId = unresolvedAlertTable[alertId] |> toString
        in [ alertId
           , imap.fetchFields(session, msgId, "date").lines[0]
           , imap.fetchText(session, msgId).lines[0]
           ]
    })
    |> fold((outStr, tuple) => {
        outStr ~ "Issue: " ~ tuple[0] ~ "\n" ~ tuple[1] ~ "\nSummary: " ~ tuple[2] ~ "\n"
    }, "\nUNRESOLVED ISSUES FROM THE PAST 10 DAYS:\n\n")

// Close the session.
imap.closeConnection(session)

// Print the report last.
print(report)
```

As a first exercise, I asked the model to implement a very simple relational database in SIL, with basic functionality such as comparing two data structures and printing the differences between them (note that `//` denotes a comment in SIL).
```
// Here, ChatGPT was prompted to write a very simple, schematic relational database in SIL
// This prompt proved too general, perhaps, and after requesting further information, I prompted the model to generate a simple, schematic database with functionalities such as "compare" and "diff" for two or more tables
users = [
    {"id": 1, "name": "John Doe", "email": "[email protected]"},
    {"id": 2, "name": "Jane Smith", "email": "[email protected]"},
    {"id": 3, "name": "Bob Johnson", "email": "[email protected]"},
]
orders = [
    {"id": 1, "user_id": 1, "product": "Widget", "quantity": 2, "price": 19.99},
    {"id": 2, "user_id": 2, "product": "Gadget", "quantity": 1, "price": 29.99},
    {"id": 3, "user_id": 3, "product": "Thingamajig", "quantity": 3, "price": 39.99},
]

// Compare two tables
compareTables(table1, table2) => {
    table1 == table2
}

// Diff two tables
diffTables(table1, table2) => {
    table1.filter(i => !table2.includes(i))
}

print(compareTables(users, users)) // true
print(diffTables(users, orders)) // [{"id": 1, "name": "John Doe", "email": "[email protected]"}]
```

Implementing a sorting algorithm is an archetypal software engineering interview question. Here, I asked ChatGPT to implement bubble sort (conceptually the most basic sorting algorithm) in SIL. Below is the response:
```
bubbleSort(arr) => {
    for i in 1 ... len(arr) {
        for j in 1 ... len(arr) - i {
            if (arr[j] > arr[j + 1]) {
                temp = arr[j]
                arr[j] = arr[j + 1]
                arr[j + 1] = temp
            }
        }
    }
    arr
}

arr = [4, 6, 1, 3, 5, 2]
print(bubbleSort(arr))
```

Although the script is a recognisable implementation of bubble sort, the model broadly ignores the functional aspects of SIL. For example, this implementation uses `for` loops and in-place mutation rather than recursion.
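To make the contrast concrete, here is a minimal sketch of what a recursive, mutation-free bubble sort can look like. It is written in Python rather than SIL, because SIL's semantics cannot be verified here; the helper names `bubble_pass` and `bubble_sort` are illustrative and not taken from the repository.

```python
from typing import List


def bubble_pass(xs: List[int]) -> List[int]:
    """One left-to-right pass: the largest element ends up in the last position."""
    if len(xs) < 2:
        return list(xs)
    first, second, rest = xs[0], xs[1], xs[2:]
    if first > second:
        # Swap the pair by emitting the smaller value and carrying the larger one forward.
        return [second] + bubble_pass([first] + rest)
    return [first] + bubble_pass([second] + rest)


def bubble_sort(xs: List[int]) -> List[int]:
    """Sort by repeated passes, recursing on everything except the settled last element."""
    if len(xs) < 2:
        return list(xs)
    passed = bubble_pass(xs)
    return bubble_sort(passed[:-1]) + [passed[-1]]


print(bubble_sort([4, 6, 1, 3, 5, 2]))  # [1, 2, 3, 4, 5, 6]
```

No variable is reassigned and no list is modified in place, which is closer in spirit to the example SIL scripts than the loop-based response above.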
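Next, I prompted the model to write a script for computing the median of a list (generated response below). This was perhaps one of the least satisfying responses. Firstly, although `if`/`else` statements feature in the example SIL code with which I prompted the model, the syntax in the generated code is incorrect. Secondly, ChatGPT hallucinates (or assumes) the existence of a module called `math` and a built-in function `sort` (a reasonable assumption, but one that markedly reduces the complexity of the prompted problem).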
```
// Notably, despite if/else statements featuring in the code prompts used for training, ChatGPT gets the syntax wrong here
import math

// Function to compute the median of a list of integers
median(numbers) => {
    // Sort the list of numbers
    sortedNumbers = sort(numbers)
    // Get the middle index of the sorted list
    middleIndex = math.floor(len(sortedNumbers) / 2)
    // If the list has an odd number of elements, return the middle element
    if len(sortedNumbers) % 2 == 1:
        return sortedNumbers[middleIndex]
    // Otherwise, return the average of the two middle elements
    else:
        return (sortedNumbers[middleIndex - 1] + sortedNumbers[middleIndex]) / 2
}
```
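For comparison, the sketch below computes a median without leaning on a ready-made sort, using a fold (Python's `functools.reduce`) to build the ordered list and a single conditional expression in place of statement-style `if`/`else`. It is Python, not SIL, and `insert_sorted` is a hypothetical helper introduced only for this illustration.

```python
from functools import reduce
from typing import List


def insert_sorted(acc: List[float], x: float) -> List[float]:
    """Insert x into an already sorted list, returning a new sorted list."""
    smaller = [y for y in acc if y <= x]
    larger = [y for y in acc if y > x]
    return smaller + [x] + larger


def median(numbers: List[float]) -> float:
    """Median via an insertion sort expressed as a fold over the input."""
    if not numbers:
        raise ValueError("median of an empty list is undefined")
    ordered = reduce(insert_sorted, numbers, [])
    mid = len(ordered) // 2
    # Odd length: the middle element. Even length: mean of the two middle elements.
    return ordered[mid] if len(ordered) % 2 == 1 else (ordered[mid - 1] + ordered[mid]) / 2


print(median([3, 1, 2]))     # 2
print(median([4, 1, 3, 2]))  # 2.5
```

The fold mirrors how the example SIL scripts accumulate results with `fold`, so a SIL version along these lines would not need to assume a `sort` built-in.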
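Next, ChatGPT was prompted to implement the k-nearest neighbours algorithm in SIL. This is actually the result of two prompts (as described in the generated code block below), because in its first attempt the model hallucinated (or assumed) a module called `distances`. In the second attempt, I prompted it to implement the algorithm without assuming that this module exists.
ChatGPT's performance was mixed. In some places it ignores the functional aspects of SIL (e.g., the `for` loop in the second attempt). In other parts of the implementation, however, it does appreciate that SIL is functional and uses the `|>` and `map` operators accurately. Notably, the model hallucinates `let` bindings, which do not feature in the example code with which it was prompted; this is likely due to conflating SIL with other functional languages such as OCaml or F#.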
```
// Note that this is the fusion of two separate scripts generated by ChatGPT to solve the k-nearest neighbours algorithm, as described in the comments below
// Overall, these attempts seem reasonable, but it does appear to hallucinate "let" bindings which do not feature in the example code with which it was prompted. This is likely due to conflating the functionality of SIL code with OCaml or F#

// Attempt 1:
// This is the first ChatGPT attempt when prompted to generate a script for the k-nearest neighbours algorithm
// In this case, it assumes (or hallucinates) a library called "distances"
import distances

// Define a point as a tuple of x and y coordinates
type Point = (num, num)

// Find the distance between two points
distance(p1: Point, p2: Point) => {
    let xDistance = p1[0] - p2[0]
    let yDistance = p1[1] - p2[1]
    in sqrt(xDistance * xDistance + yDistance * yDistance)
}

// Find the k nearest points to a given point
kNearest(point: Point, points: [Point], k: num) => {
    // Compute the distances between the given point and all other points
    let distances = points |> map(p => (p, distance(point, p)))
    // Sort the points by distance
    distances = distances |> sort((d1, d2) => d1[1] < d2[1])
    // Return the k closest points
    in distances[:k] |> map(d => d[0])
}

// Attempt 2:
// Next, ChatGPT is prompted to generate a script without importing the questionable "distances" library
// Again, it assumes some in-built functions, such as "sort" or "sqrt"

// Function to calculate the distance between two points
distance(p1, p2) => {
    sum = 0
    for i in 0 .. len(p1) - 1 {
        sum = sum + (p1[i] - p2[i]) ^ 2
    }
    sqrt(sum)
}

// Function to find the k nearest neighbors
k_nearest_neighbors(data, point, k) => {
    distances = data |> map(p => (p, distance(p, point)))
    distances |> sort((a, b) => a[1] < b[1]) |> take(k) |> map(p => p[0])
}
```
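As a reference point for what the two attempts are trying to compute, here is a minimal Python sketch of the same nearest-points search, written without loops or reassignment. It is not SIL, and it relies on Python's standard `math.sqrt` and `sorted`, so it does not settle whether SIL itself provides `sqrt`, `sort` or `take`.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def k_nearest(data: List[Point], point: Point, k: int) -> List[Point]:
    """Return the k points in data closest to point, nearest first."""
    ranked = sorted(data, key=lambda p: distance(p, point))
    return ranked[:k]


data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 2.0)]
print(k_nearest(data, (0.0, 0.0), 2))  # [(0.0, 0.0), (1.0, 1.0)]
```

Like ChatGPT's scripts, this only finds the nearest points; a full k-nearest neighbours classifier would additionally take a vote over the labels of those points.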
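One of the interesting features of functional languages is that they can be highly expressive. Here, I prompted ChatGPT to implement the basic functional operators `map`, `filter` and `reduce` (or `fold`, as it is usually called in functional languages). In the first few cases, the model appears to lose track of the functional aspects of SIL, but when prompted it can reimplement the code in a more functional manner.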
```
// map
map(fn, list) => {
    result = []
    for i in list {
        result.append(fn(i))
    }
    return result
}

// filter
filter(fn, list) => {
    result = []
    for i in list {
        if (fn(i)) {
            result.append(i)
        }
    }
    return result
}

// reduce
reduce(fn, list, initial) => {
    result = initial
    for i in list {
        result = fn(result, i)
    }
    return result
}

// reduce (prompted to use functional operators such as pipe |>, in addition to the 'map' and 'filter' functions defined previously)
reduce(fn, list, initial) =>
    list |>
    filter(fn, _, initial) =>
    map(fn, _, initial) =>
    initial
```

Project Euler is a collection of mathematical and programming problems (which I think make interesting prompts for LLMs). Here, I prompted ChatGPT to solve the first two Project Euler problems using SIL. At this point, ChatGPT's responses were rather poor: the solutions bear little resemblance to either SIL's high-level functional style or its exact syntax. It might be interesting to provide further SIL code examples at this point and then re-prompt the model to solve these problems.
```
// Project Euler problem #1:
// Here, my prompt was to solve the (easy) first problem of the Project Euler series (https://projecteuler.net/problem=1)
// Notably, despite if/else statements featuring in the code prompts used for training, ChatGPT gets the syntax wrong here
result = 0
for i in 1 ... 1000 {
    if (i % 3 == 0 || i % 5 == 0) {
        result += i
    }
}
print(result)

// Project Euler problem #2:
// This prompt is to solve the second Project Euler problem (https://projecteuler.net/problem=2)
// The solution proposed by ChatGPT is sensible, but potentially inefficient, and ignores the recursion of functional languages
// Notably, despite if/else statements featuring in the code prompts used for training, ChatGPT gets the syntax wrong here
result = 0
fib(n) => {
    if (n <= 1) {
        return n
    }
    return fib(n - 1) + fib(n - 2)
}
for i in 1 ... 100 {
    val = fib(i)
    if (val > 4000000) {
        break
    }
    if (val % 2 == 0) {
        result += val
    }
}
print(result)
```
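For reference, both problems can be expressed without mutable accumulators. The following is a Python sketch, not SIL; the function names `euler1`, `euler2` and `fibonacci` are illustrative. Problem 1 sums the multiples of 3 or 5 strictly below 1000, and problem 2 sums the even Fibonacci terms that do not exceed four million.

```python
from itertools import takewhile
from typing import Iterator


def euler1(limit: int = 1000) -> int:
    """Sum of all multiples of 3 or 5 strictly below limit."""
    return sum(i for i in range(1, limit) if i % 3 == 0 or i % 5 == 0)


def fibonacci() -> Iterator[int]:
    """Infinite Fibonacci sequence starting 1, 2, as in the problem statement."""
    a, b = 1, 2
    while True:
        yield a
        a, b = b, a + b


def euler2(limit: int = 4_000_000) -> int:
    """Sum of the even Fibonacci terms whose value does not exceed limit."""
    terms = takewhile(lambda v: v <= limit, fibonacci())
    return sum(x for x in terms if x % 2 == 0)


print(euler1())  # 233168
print(euler2())  # 4613732
```

Generating the Fibonacci terms incrementally avoids recomputing `fib(i)` from scratch for every `i`, which is the inefficiency noted in the comments on ChatGPT's second solution.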
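The ability of LLMs to infer and understand the features of a novel programming language in a few-shot manner remains an open and interesting question. Here, these abilities were explored by prompting ChatGPT to solve problems in a new domain-specific language (DSL). DSLs are a potentially useful test case for exploring reasoning versus memorisation in LLMs, because they often have distinctive features and are less likely to have been encountered extensively (if at all) during training.
Overall, performance was mixed: ChatGPT correctly understood that this DSL is a functional programming language (although it sometimes had to be re-prompted on this point) and implemented its solutions accordingly. However, it broadly failed to capture the syntactic rules of this DSL from the five example scripts provided. Further work (such as using other DSLs, developing more formal metrics for evaluating LLM-generated code, or quantifying the model's learning dynamics) would make interesting extensions to this repository. Finally, in a separate ChatGPT session, I provided this material to see how the model would interpret its own SIL code: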

ChatGPT examines its own attempts to write code in the domain-specific language SIL and describes some of its features.
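There is evidence that memorisation in LLMs is promoted by the frequency with which examples are presented during training and by the number of relevant tokens used to prompt the model.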