[NB:可以在此處找到使用GPT-4的回購的後續。 ]
大型語言模型(LLM)(例如Chatgpt和Claude)表現出了令人印象深刻的編程能力,並且能夠解決各種語言及其分類法的問題。儘管取得了這些成功,但在這些模型表現出對這些語言基礎的句法和操作規則的任何基本欣賞程度上,某些懷疑態度仍然存在。
在此及時的工程存儲庫中,使用任意域特異性語言(DSL)探索ChatGpt的編程能力。 DSL代表了研究LLMS的推理能力的有吸引力的底物,因為它們是新穎的,並且在訓練中遇到廣泛遇到和記憶的可能性較小。因此,它們可以更直接地測試LLM可以以幾種方式推斷出新型編程語言規則的程度。
在這裡,選擇了特定於域的語言SIL(對稱集成語言),原因有兩個。首先,在培訓期間,Chatgpt不太可能接觸任何SIL代碼,因為這是由技術重型對沖基金(稱為Symmetry Investments)開發的內部DSL。其次,作為一種編程語言,該模型具有一些有趣的功能(例如,這是一種強調表達性的功能性語言,但缺乏像Haskell或Ocaml中的let ”。

在一些示例代碼提示之後,Chatgpt嘗試在小說的功能性DSL中編寫代碼,稱為“ SIL”。
以下是一個提示的集合,其中包含SIL代碼的簡短示例,這些sil代碼突出了其功能。在提示任務並提供SIL代碼示例的Chatgpt(請參見下圖;完整的提示歷史記錄和SIL代碼示例)之後,我要求它在SIL中實現許多主流編程任務。
在下面的一節中,我顯示了一些示例SIL代碼腳本,並在此提示了該模型(可以在此處找到完整的示例)及其在SIL中實現各種問題的嘗試。
第一個提示是處理一些郵件服務器功能的腳本。因此,它與我隨後會提示chatgpt的問題類型有很大不同,但它確實說明了SIL(例如|> )的某些語法,數據結構和功能方面。
// example of using IMAP IDLE to run rules on new mail
import imap
moveMessages(session,ids,target) => if (ids.length > 0 ) then imap.moveUIDs(session,ids,target) else false
login = imap.ImapLogin(environment( " IMAP_USER " ),environment( " IMAP_PASS " ))
server = imap.ImapServer( " imap.fastmail.com " , " 993 " )
session = imap.Session(server,login, true ,imap.Options(debugMode: true )) | > imap.openConnection | > imap.login
rules=[ [
[ " INBOX/0receipts " ,
[
" FROM [email protected] " ,
" FROM interactivebrokers.com " ,
]],
[ " Junk " ,
[
" FROM Tapatalk " ,
]],
[ " INBOX/newsletters " ,
[
" FROM [email protected] " , " HEADER X-mailer mailgun " , " HEADER X-mailer WPMailSMTP/Mailer/mailgun 2.4.0 " , " HEADER X-mailer nlserver " ,
" FROM hbr.org " , " FROM elliottwave.com " , " OR FROM cio.com FROM cio.co.uk " , " FROM substack.com " , " FROM eaglealpha.com " ,
" FROM haaretz.com " , " FROM gavekal.com " , " FROM go.weka.io " , " FROM marketing.weka.io " , ` HEADER list-unsubscribe "" ` ,
` HEADER list-Id "" ` , ` HEADER list-Post "" ` , ` HEADER list-owner"" ` , ` HEADER Precedence bulk ` , ` HEADER Precedence list ` ,
` HEADER list-bounces "" ` , ` HEADER list-help "" ` , ` HEADER List-Unsubscribe "" ` , " FROM no-reply " ,
]],
[ " INBOX/notifications " ,
[
` KEYWORD "$IsNotification" ` , " FROM [email protected] " , " FROM [email protected] " , " FROM skillcast.com " ,
" FROM reedmac.co.uk " , " FROM [email protected] " , " FROM [email protected] " , " FROM [email protected] " ,
]],
]
runRules(Session,Rules) => Rules
| > map(target => [target[ 0 ],(target[ 1 ] | > map(term => imap.search(Session,term).ids))])
| > mapa(set => moveMessages(Session,set[ 1 ] | > join,set[ 0 ]))
runRulesBox(Session,Rules,Mailbox) => {
imap.select(Session,Mailbox)
in runRules (Session,Rules)
}
inboxes=[ " INBOX " ]
result = inboxes | > mapa(inbox => runRulesBox(session,rules,imap.Mailbox(session,inbox)))
print(result)
import parallel;
threadFunction(x) => {
imap.idle(session)
in inboxes | > mapa(inbox => runRulesBox(session,rules,imap.Mailbox(session,inbox)))
}
parallel.runEvents((x) => false ,[threadFunction])第二個示例代碼提示類似地旨在突出該DSL的某些功能,並引入了一些新的標準庫功能,例如iota和fold 。
import imap
import imap_config
import string
// Get the configuration from the environment and command line.
config = imap_config.getConfig(commandLineArguments)
// -------------------------------------------------------------------------------------------------
// Some helper functions.
//
// Firstly, a function to join an array of strings.
joinFields(flds, sep) => {
len(flds) > 0 | > enforce( " Cannot join an empty array. " )
in fold (flds[ 1 :$], (str, fld) => str ~ sep ~ fld, flds[ 0 ])
}
// Secondly, a field formatter which strips the field prefix and pads to a fixed width.
// E.g., ("From: [email protected]" |> fmtField(20)) == "[email protected] "
fmtField(field, width) => {
pad(str) => iota(width - len(str)) | > fold((a, i) => a ~ " " , str)
in field
| > string .split( " : " )[ 1 :$]
| > joinFields( " : " )
| > pad
}
// And thirdly, a function which concatenates the headers into a formatted string.
fmtHeaders(outStr, headers) => {
outStr ~ " " ~ joinFields(headers, " | " ) ~ " n "
}
// -------------------------------------------------------------------------------------------------
// Connect to the inbox.
creds = imap.ImapLogin(config.user, config.pass)
server = imap.ImapServer(config.host, config.port)
session =
imap.Session(server, creds)
| > imap.openConnection()
| > imap.login()
inbox = imap.Mailbox(session, " INBOX " )
// Get the number of messages in the inbox.
msgCount = imap.status(session, inbox).messages
// Select the default inbox.
inbox | > imap.examine(session, _)
// Get the headers (date, from and subject) for each message, from oldest to newest, format and
// print them.
headers =
iota(msgCount)
| > map(id => " # " ~ toString(id + 1 ))
| > map(id =>
imap.fetchFields(session, id, " date from subject " ).lines
| > map(hdr => fmtField(hdr, 40 )))
| > fold(fmtHeaders, " INBOX: n " )
print(headers)第三個代碼示例進一步說明了該DSL的一些異常功能,其目的是接下來將在其自己的實現中使用這些功能。
// This script will search for emails and match new issue numbers with resolutions to report the
// number of outstanding alerts.
import imap
import * from imap.query
import imap_config
import dates
import string
// Get the configuration from the environment and command line.
config = imap_config.getConfig(commandLineArguments)
// Connect to the inbox.
creds = imap.ImapLogin(config.user, config.pass)
server = imap.ImapServer(config.host, config.port)
session =
imap.Session(server, creds)
| > imap.openConnection()
| > imap.login()
inbox = imap.Mailbox(session, " support " )
// Select the default inbox.
inbox | > imap.examine(session, _)
// These criteria are common for both our searches.
commonCrit = imap.Query()
| > and(from( ` [email protected] ` ))
| > and(sentSince(dates. Date ( 2020 , 5 , 13 )))
// Get each of the alerts and resolutions from the past week (13-19 May 2020).
alertMsgIds =
imap.search(session, imap.Query(subject( " Alert: new issue " )) | > and(commonCrit)).ids
resolutionMsgIds =
imap.search(session, imap.Query(subject( " Resolution: issue " )) | > and(commonCrit)).ids
// A function to get the alert ID from a message subject.
getAlertId(msgId) => {
imap.fetchFields(session, toString (msgId), " subject " ).lines[ 0 ]
| > string .split()[$ - 1 ]
}
// A function to remove an entry from a table whether it's there or not.
removeIfExists(tbl, key) => {
if find( keys (tbl), key) == [] then
tbl
else
removeEntry(tbl, key)
}
// Now find those alerts which have no resolution. Firstly the subject for each alert, get the
// issue number end and store it in a table.
allAlertTable = alertMsgIds | > fold((tbl, msgId) => addEntry(tbl, getAlertId(msgId), msgId), {})
// Go through the resolutions and remove their corresponding alerts from the table.
unresolvedAlertTable =
resolutionMsgIds | > fold((tbl, msgId) => removeIfExists(tbl, getAlertId(msgId)), allAlertTable)
// Create a report with the date of the unresolved alerts.
report =
keys (unresolvedAlertTable)
| > map(alertId => {
msgId = unresolvedAlertTable[alertId] | > toString
in [ alertId
, imap.fetchFields(session, msgId, " date " ).lines[ 0 ]
, imap.fetchText(session, msgId).lines[ 0 ]
]
})
| > fold((outStr, tuple) => {
outStr ~ " Issue: " ~ tuple[ 0 ] ~ " n " ~ tuple[ 1 ] ~ " n Summary: " ~ tuple[ 2 ] ~ " n "
}, " n UNRESOLVED ISSUES FROM THE PAST 10 DAYS: nn " )
// Close the session.
imap.closeConnection(session)
// Print the report last.
print(report)作為第一次練習,我要求模型在SIL中實現一個非常簡單的關係數據庫,並具有基本功能,例如比較和打印兩個數據結構之間的差異(請注意, //表示SIL中的評論)。
// Here, ChatGPT was prompted to write a a very simple, schematic relational database in SIL
// This prompt proved too general, perhaps, and after requesting further information, I prompted the model to generate a simple, schematic database with functionalities such as "compare" and "diff" for two or more tables
users = [ { " id " : 1 , " name " : " John Doe " , " email " : " [email protected] " }, { " id " : 2 , " name " : " Jane Smith " , " email " : " [email protected] " }, { " id " : 3 , " name " : " Bob Johnson " , " email " : " [email protected] " },]
orders = [ { " id " : 1 , " user_id " : 1 , " product " : " Widget " , " quantity " : 2 , " price " : 19.99 }, { " id " : 2 , " user_id " : 2 , " product " : " Gadget " , " quantity " : 1 , " price " : 29.99 }, { " id " : 3 , " user_id " : 3 , " product " : " Thingamajig " , " quantity " : 3 , " price " : 39.99 },]
// Compare two tables
compareTables(table1, table2) => {
table1 == table2
}
// Diff two tables
diffTables(table1, table2) => {
table1.filter(i => ! table2.includes(i))
}
print(compareTables(users, users)) // true
print(diffTables(users, orders)) // [{"id": 1, "name": "John Doe", "email": "[email protected]"}]實施排序算法是一個原型軟件工程面試問題。在這裡,我要求Chatgpt在SIL中實現Bubblesort(概念上最基本的排序算法)。以下是回應:
bubbleSort(arr) => {
for i in 1 ... len(arr) {
for j in 1 ... len(arr) - i {
if (arr[j] > arr[j + 1 ]) {
temp = arr[j]
arr[j] = arr[j + 1 ]
arr[j + 1 ] = temp
}
}
}
arr
}
arr = [ 4 , 6 , 1 , 3 , 5 , 2 ]
print(bubbleSort(arr))儘管該腳本確實是Bubblesort的工作實現,但該模型廣泛地忽略了SIL的功能方面。例如,這種實現Bubblesort使用for循環而不是遞歸的實現。
接下來,我提示該模型編寫一個用於計算列表中值的腳本(下面生成的響應)。這也許是最不令人滿意的反應之一。首先, if/else我提示該模型的示例SIL代碼中具有示例SIL代碼中的語句,但此生成的代碼中的語法是不正確的。其次,chatgpt幻覺(或假設)存在稱為math和建造功能sort模塊的存在(這是合理的假設,但它確實顯著降低了提示問題的複雜性)。
// Notably, despite if/else statements featuring in the code prompts used for training, ChatGPT gets the syntax wrong here
import math
// Function to compute the median of a list of integers
median(numbers) => {
// Sort the list of numbers
sortedNumbers = sort(numbers)
// Get the middle index of the sorted list
middleIndex = math.floor(len(sortedNumbers) / 2 )
// If the list has an odd number of elements, return the middle element
if len(sortedNumbers) % 2 == 1 :
return sortedNumbers[middleIndex]
// Otherwise, return the average of the two middle elements
else :
return (sortedNumbers[middleIndex - 1 ] + sortedNumbers[middleIndex]) / 2
}
接下來提示Chatgpt在SIL中實現K -Neart最鄰居算法。這實際上是兩個提示的結果(如下生成的代碼塊中所述),因為模型幻覺/假設第一次嘗試中的模塊稱為distances 。在第二次嘗試中,我提示它實現算法而不假設存在該模塊。
Chatgpt的表現混合在一起。首先,在某些情況下,它忽略了SIL的功能方面(例如,在第一次嘗試中實現A for 。但是,在實現的其他部分中,它確實讚賞SIL具有功能性,並且使用|>和map運算符準確地使用。尤其是,該模型確實是幻覺是幻覺let幻覺的,而在示例中不符合該限制的互聯網或fip the the conty the conty the conty the sil the sil the workation the功能性的功能。
// Note that this is the fusion of two separate scripts generated by ChatGPT to solve the k-nearest neighbours algorithm, as described in the comments below
// Overall, these attempts seem reasonable, but it does appear to hallucinate "let" bindings which do not feature in the example code with which it was prompted. This is likely due to conflating the functionality of SIL code with OCaml or F#
// Attempt 1:
// This is the first ChatGPT attempt when prompted to generate a script for the k-nearest neighbours algorithm
// In this case, it assumes (or hallucinates) a library called "distances"
import distances
// Define a point as a tuple of x and y coordinates
type Point = (num, num)
// Find the distance between two points
distance(p1: Point, p2: Point) => {
let xDistance = p1[ 0 ] - p2[ 0 ]
let yDistance = p1[ 1 ] - p2[ 1 ]
in sqrt (xDistance * xDistance + yDistance * yDistance)
}
// Find the k nearest points to a given point
kNearest(point: Point, points: [Point], k: num) => {
// Compute the distances between the given point and all other points
let distances = points | > map(p => (p, distance(point, p)))
// Sort the points by distance
distances = distances | > sort((d1, d2) => d1[ 1 ] < d2[ 1 ])
// Return the k closest points
in distances[:k] | > map(d => d[ 0 ])
}
// Attempt 2:
// Next, ChatGPT is prompted to generate a script without importing the questionable "distances" library
// Again, it assumes some in-built functions, such as "sort" or "sqrt"
// Function to calculate the distance between two points
distance(p1, p2) => {
sum = 0
for i in 0 .. len(p1) - 1 {
sum = sum + (p1[i] - p2[i]) ^ 2
}
sqrt(sum)
}
// Function to find the k nearest neighbors
k_nearest_neighbors(data, point, k) => {
distances = data | > map(p => (p, distance(p, point)))
distances | > sort((a, b) => a[ 1 ] < b[ 1 ]) | > take(k) | > map(p => p[ 0 ])
}
功能語言的有趣特徵之一是它們可以具有很高的表現力。在這裡,我提示Chatgpt實現基本的功能運算map , filter和reduce (或在功能語言的上下文中fold )。在最初的幾種情況下,該模型似乎失去了SIL功能方面的跟踪,但在提示時可以以更有功能的方式重新實現代碼。
// map
map(fn, list) => {
result = []
for i in list {
result.append(fn(i))
}
return result
}
// filter
filter(fn, list) => {
result = []
for i in list {
if (fn(i)) {
result.append(i)
}
}
return result
}
// reduce
reduce(fn, list, initial) => {
result = initial
for i in list {
result = fn(result, i)
}
return result
}
// reduce (prompted to use functional operators such as pipe |>, in addition to the 'map' and 'filter' functions defined previously)
reduce(fn, list, initial) =>
list | >
filter(fn, _, initial) =>
map(fn, _, initial) =>
initialEuler項目是數學和編程問題的集合(我認為這是LLM的有趣提示)。在這裡,我提示Chatgpt使用SIL解決前兩個項目Euler問題。在這一點上,Chatgpt的Reponss有些不利。這些解決方案都與SIL的高級功能和其ExaxCT語法都沒有很大相似之處。此時提供進一步的SIL代碼示例,然後重新提交模型以解決這些問題可能很有趣。
// Project Euler problem #1:
// Here, my prompt was to solve the (easy) first problem of the Project Euler series (https://projecteuler.net/problem=1)
// Notably, despite if/else statements featuring in the code prompts used for training, ChatGPT gets the syntax wrong here
result = 0
for i in 1 ... 1000 {
if (i % 3 == 0 || i % 5 == 0 ) {
result += i
}
}
print(result) // Project Euler problem #2:
// This prompt is to solve the second Project Euler problem (https://projecteuler.net/problem=2)
// The solution proposed by ChatGPT is sensible, but potentially inefficient, and ignores the recursion of functional languages
// Notably, despite if/else statements featuring in the code prompts used for training, ChatGPT gets the syntax wrong here
result = 0
fib(n) => {
if (n <= 1 ) {
return n
}
return fib (n - 1 ) + fib(n - 2 )
}
for i in 1 ... 100 {
val = fib(i)
if (val > 4000000 ) {
break
}
if (val % 2 == 0 ) {
result += val
}
}
print(result)
探索LLMs以幾種方式推斷和理解新型編程語言特徵的能力仍然是一個開放而有趣的問題。在這裡,這些能力是在促使Chatgpt解決的促使特定領域特定語言(DSL)的問題的背景下進行了探索的。 DSL是探索LLMS中推理與記憶的潛在有用的測試用例,因為它們通常具有獨特的特徵,並且在訓練期間(如果有的話)中遇到了廣泛的特徵。
總體而言,性能是混合的:Chatgpt正確理解該DSL是一種功能性編程語言(儘管有時必須重新提出此DSL),並相應地實施了其解決方案。但是,它廣泛未能從提供的五個示例腳本中捕獲該DSL的句法規則。進一步的研究(例如使用其他DSL,開發一些更正式的指標來評估LLM生成的代碼,或量化模型的學習動力)將使此存儲庫的有趣擴展。最後,在一個單獨的ChatGpt會話中,我提供了此元數據,以確定該模型如何解釋其自己的SIL代碼:

Chatgpt研究了自己在特定於域的語言SIL中編寫代碼的嘗試,並描述了其一些功能。
有證據表明,LLM記憶是通過訓練示例演示的頻率以及用於提示模型的相關令牌的數量來促進的。 ↩